THE KUHLMANN-ANDERSON INTELLIGENCE TESTS 
COMPARED WITH SEVEN OTHERS 


F. KUHLMANN 


Director, Division of Research, Minnesota State Department of Public 
Institutions 


The Kuhlmann-Anderson intelligence tests consist of a series 
of thirty-five tests, so adjusted in difficulty that the average 
five-year-old child will pass one item or trial in the easiest of 
the thirty-five, and the average adult will not pass all the trials 
in any of the last twelve tests on the upper end of the scale. 
These thirty-five tests are grouped into eight overlapping 
batteries: one battery of ten tests for each of grades I to VI, 
a seventh battery of ten tests for grades VII and VIII, and an 
eighth battery of twelve tests for grades IX and above. The tests 
are standardized by computing the average age of non-selected 
public school children who pass one trial and no more, two 
trials and no more, and so on, of each test. This weights the 
raw scores in terms of age ability. In scoring, the number of 
trials a child passes on each test in the battery is converted 
into a mental age, and the median of these ten or twelve mental 
ages is taken as his mental age score.¹ 
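The scoring rule just described can be sketched in Python. This is a minimal illustration only: the norm values are invented, the three-test battery is shorter than the real ten- or twelve-test batteries, and the name `median_mental_age` is hypothetical rather than taken from the instruction manual.

```python
from statistics import median

def median_mental_age(norms, trials_passed):
    """Return the median of the mental ages earned on the separate tests.

    norms[test][k] is assumed to hold the average age, in months, of
    non-selected children who pass exactly k trials on that test.
    """
    ages = [norms[test][k] for test, k in trials_passed.items()]
    return median(ages)

# Invented norms for a three-test battery (ages in months).
example_norms = {
    "test_1": {1: 84, 2: 90, 3: 96},
    "test_2": {1: 88, 2: 95, 3: 102},
    "test_3": {1: 92, 2: 100, 3: 108},
}

# Number of trials one child passed on each test.
example_child = {"test_1": 3, "test_2": 2, "test_3": 1}

# The three mental ages earned are 96, 95 and 92 months.
example_score = median_mental_age(example_norms, example_child)
```

Because the final score is a median, a single unusually high or low mental age on one test moves the result very little.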

The procedure in producing these tests, in weighting the raw 
scores, and in scoring is fundamentally different from that of 
any other tests on the market. This study aimed to get some 
comparison of these tests with a number of others among those 
most familiar to users of group tests. The others chosen were 
as follows: Pintner-Cunningham Primary (P.-C.), given in 
grades I and II. Otis Group Intelligence Scale, Primary, 
Form A (O.-G.), given in grades III and IV. National 
Intelligence Tests, Scale A (N), given in grades V to VIII. Otis 
Self-Administering Tests of Mental Ability (Higher Examination, 
with 30 minutes time limit) (O. S.-A.), given in half of 
each grade in grades IX to XII. Terman Group Test of 
Mental Ability (T), given in the other half in grades IX to XII. 
These several scales of tests were used to test the entire school 
population, about 1400, of one town (A). The Kuhlmann-Anderson 
tests (K.-A.) were given to the same children a few 
days before or after one of the others was given. The schedule 
was so arranged that to half of each grade the Kuhlmann-Anderson 
tests were given first, while to the other half one of the 
other tests was given first. In addition to this, another 
comparison was made by giving the Detroit Primary Intelligence 
test, Form C (D.-P.), to about one hundred children in each 
of grades II, III, and IV, and the Detroit Alpha Intelligence 
test, Form M (D.A.), to the same number in each of grades 
V to VIII, in another and much larger town (B). The classes 
tested in each of these grades were selected, and cannot be 
taken as representative of the whole school population. In this 
comparison the Kuhlmann-Anderson tests were given two to 
three weeks after the Detroit test in all instances.² 

¹ For further details, see “Intelligence Tests for Ages Six to 
Maturity. Instruction Manual,” by F. Kuhlmann and Rose Anderson, 
1927, published by Educational Test Bureau, Minneapolis, Minn. 
Also, “A Median Mental Age Method of Weighting and Scoring 
Mental Tests,” F. Kuhlmann, Journ. Applied Psychol., June, 1927. 

The number of children enrolled in the grades and classes to 
which the tests were given is shown in the following: 

                                GRADE
           I    II   III   IV    V    VI   VII  VIII  [?]  [?]
Town A    139   112  108   115  104   89    95   99   221  191
Town B          103  103    98  118  106   108  107

[Headings of the last two columns are illegible in the source.] 

² The Kuhlmann-Anderson tests were in all instances given by 
Charlotte Lowe and Genette Ulvin, psychologists of the Research 
Division, Minnesota State Department of Public Institutions. Both 
have had extensive experience in individual and group test examinations. 
They also gave all the other group tests used in the first comparison. In 
town B the Detroit tests were given by Dr. E. L. Norton, Assistant 
Director of Research, Board of Education, St. Paul, Minn. 
The number to whom the tests were given never varied more 
than the few that were due to absences, and space will not be 
taken in the various tables that follow to give the exact number 
in each instance. 

Our tests embody certain general ideas and principles, and 
inasmuch as we wish to see how these work out in comparison 
with other tests, I shall present the results as much as possible 
in direct relation to these principles. Most readers will recall 
some earlier criticism over the fact that so little theory or 
statement of general principles accompanied the appearance of 
the original Binet-Simon Scale in 1908. Since then we have 
had plenty of attempts to supply this lack, but they have been of 
a statistical rather than of a psychological nature. I shall 
attempt to go behind the questions of “validity” and 
“reliability,” and consider some of the factors that may make a 
test good or bad. 


A. DIFFERENT TESTS FOR DIFFERENT MENTAL LEVELS 


Mental test results disprove any view of mental development 
as primarily a quantitative increase from year to year in one 
and the same thing. Tests that will discriminate the lowest 
successive levels may have something in common with those 
required to discriminate between higher levels of development, 
but that common element, if discoverable at all, is not sig- 
nificant or important in describing or in measuring the differ- 
ences between lower and higher. 


1. Effect of difficulty of a test on its correlation with a criterion 


The same tests used at successive ages or school grades must, 
of course, become increasingly easier for the higher mental 
levels. To determine the influence of this on the merits of the 
tests at different levels, correlations (by “four-fold table” 
method) were computed between the scores on each test and 
the total or median score on the whole battery of tests. This 
was done for each school grade separately. 
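The text does not say which four-fold table formula was used. As one possibility only, the sketch below splits each pair of scores at a cut point, counts the four cells, and applies the familiar cosine-pi approximation to the tetrachoric correlation. The function names and the choice of cut points are assumptions made for illustration.

```python
import math

def fourfold_counts(test_scores, battery_scores, test_cut, battery_cut):
    """Count the four cells of a 2x2 table split at the given cut points."""
    a = b = c = d = 0
    for t, s in zip(test_scores, battery_scores):
        high_t, high_s = t >= test_cut, s >= battery_cut
        if high_t and high_s:
            a += 1  # high on both test and battery
        elif high_t:
            b += 1  # high on test, low on battery
        elif high_s:
            c += 1  # low on test, high on battery
        else:
            d += 1  # low on both
    return a, b, c, d

def cos_pi_correlation(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation."""
    agree, disagree = a * d, b * c
    if agree == 0 and disagree == 0:
        raise ValueError("correlation is undefined for this table")
    if disagree == 0:
        return 1.0
    return math.cos(math.pi / (1 + math.sqrt(agree / disagree)))
```

With 40 children in each agreeing cell and 10 in each disagreeing cell, `cos_pi_correlation(40, 10, 10, 40)` comes to about 0.81.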

In the Kuhlmann-Anderson test batteries for successive 
school grades the tests in each battery are arranged in order of 
difficulty from first to last. Having computed the correlations 
between the mental ages earned on each separate test and the 
median mental ages on the battery, separately for the results 
from each school grade, we averaged these correlations for the 
first three tests in all the batteries for grades III to VII, for 
the middle four tests in the batteries, and for the last three tests 
in the batteries. This gives an average for fifteen correlations 
for the easy tests, the first three tests in each of the five bat- 
teries, an average for twenty correlations for tests of median 
difficulty, the middle four tests in five batteries, and an average 
for fifteen correlations for the relatively difficult tests, the last 
three tests in each of five batteries. These averages were as 
follows: 

FIRST THREE TESTS     MIDDLE FOUR TESTS     LAST THREE TESTS
      0.631                 0.648                 0.680

Each of the tests numbered 16 to 25 appears in three suc- 
cessive test batteries. In the first battery in which it appears 
it is, therefore, relatively difficult for the age level where the 
test is used. In the second battery in which it appears it is of 
median difficulty and in the third battery it is relatively easy. 
Taking these correlations for these ten tests and averaging 
them separately for the first, second, and third battery in which 
they appear, gives the following: 

FIRST BATTERY     SECOND BATTERY     THIRD BATTERY
    0.745              0.672             0.619

When the K.-A. tests were not yet in their present form, we 
had fifteen tests per battery for the successive ages and grades. 
The raw scores were at that time transmuted into sigma scores, 
and correlations (by the Pearson product-moment method) 
were computed between the sigma scores on the single tests 
and the total sigma score on the battery, separately for the 
results for each age. The tests were then not as well arranged 
in order of difficulty as now. But averaging these correlations 
for the first, middle, and last portions of the batteries gave the 
following: 

FIRST PART     MIDDLE PART     LAST PART
   0.514          0.593          0.605

A distinction has often been made between so-called “speed” 
and “power” tests. The speed test is essentially one in which 
the successive trials or items are equal in difficulty and 
relatively easy of performance for the mental level at which the test 
is used, so that a high score on it depends chiefly on the amount 
of effort put into doing a thing fast. The power test is 
essentially one in which the items are relatively difficult of 
performance, so that a high score on it depends chiefly on the 
presence of mental traits that are necessary for its performance 
at any speed. The speed test can be done at lower mental 
levels, at a slower rate. The power test cannot be done at all 
at lower levels. To a high degree, all tests are power tests 
when used at low enough mental levels, and become speed 
tests when used at high enough mental levels. Test number 5 
in the National, and test number 23 in the Kuhlmann-Anderson 
are speed tests. Correlations between scores on these and the 
total scores on the battery are very much lower than the 
correlations for the other tests, at every school grade. In table 1 
the second line gives the average correlations, for the five tests 
in the National battery, between scores on each test and total 
scores on the battery. The third line gives these correlations 
for test 5. In the last two lines in the table, the same results 
are given for the Kuhlmann-Anderson tests. 

TABLE 1

                                GRADE   GRADE   GRADE   GRADE
                                  V       VI     VII    VIII
N       Average for 5 tests     0.81    0.72    0.76    0.68
        Test 5                  0.49    0.36    0.44    0.28
K.-A.   Average for all tests   0.77    0.61    0.55    0.66
        Test 23                 0.54    0.33    0.32    0.51

In all these comparisons it appears that a mental test becomes 
increasingly less valuable as it becomes easier of performance 
if we take these correlations alone as evidence. Of course, 
there must be a limit when going in the other direction. A test 
cannot continue to improve in value the more difficult it 
becomes. But we should expect these correlations to be highest 
at a given age level and then decrease at age levels above and 
below this. Unfortunately when we compare these correla- 
tions for a given battery of tests applied in different school 
grades, complicating factors enter which cover up whatever 
influence there may be of a decreasing difficulty of a test. One 
of these factors is the increasing spread in the distribution of 
the mental ages, as we pass from lower to higher grades, which 
raises the correlations as we go up to higher school grades. A 
second factor which is undoubtedly present is the increasing 
variability with increasing age of each mental trait that each 
mental test is measuring, this variability being more or less 
independent of the variability of the mental ages as a whole. 
A third factor is the varying percentage of zero scores on par- 
ticular tests. While the first factor would increase the cor- 
relations between the scores on a test and the total scores on the 
battery, the second and third factors would decrease these cor- 
relations. With these two is combined the fourth, the influence 
of decreasing of difficulty of the test. Nevertheless, when all 
the results on these matters are considered together, the 
presence of this fourth factor will be shown in a marked degree. 
But before giving further results on correlations in connection 
with the effect of ease or difficulty of a test on its value, let us 
see how ease and difficulty affect the variability on the raw 
scores of a test. 
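The first factor named above, the effect of a wider spread of ability on a correlation, can be illustrated with the standard textbook formula for a change in range. The formula is not given in the source and is used here only as an illustration, assuming that the change in spread arises from selection on the total score; the function name is hypothetical.

```python
import math

def correlation_after_spread_change(r, sd_ratio):
    """Correlation expected when the S.D. of the total score is
    multiplied by sd_ratio, all else remaining equal.

    r        -- correlation observed with the original spread
    sd_ratio -- new S.D. divided by original S.D. (must be > 0)
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    return r * sd_ratio / math.sqrt(1 - r * r + r * r * sd_ratio * sd_ratio)

# A correlation of 0.60 in a grade, recomputed for a grade whose
# spread of mental ages is one and a half times as large.
wider = correlation_after_spread_change(0.60, 1.5)
```

Under these assumptions a correlation of 0.60 rises to roughly 0.75 when the spread increases by half, which shows how a larger S.D. in the upper grades can mask a real decline in the value of a test.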


2. Effect of difficulty of test on variability of raw score 


For each test in each battery the variability in the raw scores 
was found by computing the average raw score on the test, 
finding the average deviation, and then dividing the average 
deviation by the average (A.D./Av.), in order to make this 
measure of variability more comparable at all points. This was 
done separately for the results in lower and higher school grades, 
with the exceptions of the P.-C. and K.-A. tests for town A, 
where the large number of zero scores affected the averages too 
much, and the Otis Self-Administering because of its single 
series of items or trials not grouped into different tests. The 
results, which are given in table 2, were quite contrary to 
expectation. This variability decreases most markedly as 
we pass from lower to higher school grades in all but the K.-A. 
test results. There is a drop in variability also for the K.-A. 
tests in grades II to IV. Beyond this the variability remains 
about the same. In table 2 the first line gives the order of the 
test in the battery. The Roman numerals on the left designate 
the school grades. The figures in the body of the table give the 
A.D./Av. for each separate test, and the averages of these for 
each grade and battery are given in the last column on the right. 
These averages on the right are higher for the K.-A. tests in 
seven of the nine comparisons in the table. The exceptions are 
in the comparison with the Detroit in grade II, and the Detroit in 
grade V. This decrease in the variability of the raw scores as 
we go up in the grades must be due either to a decrease in the 
actual variability of the children found in the upper grades, 
or to a decrease in the ability of the tests to measure the 
variability that exists in the children. Variability in mental 
age increases as we go up in the grades quite as markedly as 
variability in raw scores decreases, as will be seen later. But 
this may be accounted for by the fact that the intervals between 
successive mental ages become smaller the higher the mental 
age, and by the smaller yearly increase in raw score that repre- 
sents a full year increase in mental age at the higher levels. 
The variability in the IQ was computed for the K.-A. test 
results in the same way as for the raw score. This gave the 
following (table 3). 
TABLE 3

GRADE     GRADE     GRADE     GRADE     GRADE
 III        IV        V         VI       VII
 0.10      0.09      0.11      0.09      0.08

This does not indicate any significant change in variability in 
IQ from lower to upper grades. 

On the other hand there is conclusive evidence that the 
variability in raw score is directly related to the difficulty of 
the test. In the K.-A. batteries the tests are arranged in 
order of difficulty from first to last in each battery, difficulty 
being measured by the average age of non-selected school 
children who pass one trial and no more in the test. The 
following gives the average A.D./Av. for the first, easy half, and 
the second, difficult half of the K.-A. batteries. Table 4 is 
derived from table 2. 

TABLE 4

[Table illegible in the source except for a few entries. Town A: 
first half, 0.35 and 0.26; last half, 0.40 and 0.32. Town B: first 
half, 0.34 and 0.21; last half, 0.49 and 0.27. The grade headings 
are lost.] 

In each of the nine comparisons here the variability is 
markedly higher for the tests of the last and more difficult 
half of the battery. 
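The measure of variability used in these tables, the average deviation divided by the average, is simple to compute. The sketch below uses invented raw scores and a hypothetical function name; it shows only how the same absolute scatter gives a smaller ratio once a test has become easy and the average score high.

```python
def relative_average_deviation(scores):
    """Average deviation from the mean, divided by the mean (A.D./Av.)."""
    if not scores:
        raise ValueError("no scores given")
    average = sum(scores) / len(scores)
    if average == 0:
        raise ValueError("ratio is undefined when the average is zero")
    average_deviation = sum(abs(s - average) for s in scores) / len(scores)
    return average_deviation / average

# Invented raw scores: the same test given where it is difficult
# and where it has become easy.
difficult_level = [2, 4, 6, 8, 10]
easy_level = [16, 18, 20, 22, 24]

ratio_difficult = relative_average_deviation(difficult_level)
ratio_easy = relative_average_deviation(easy_level)
```

Both sets of scores have an average deviation of 2.4 trials, but the ratio is 0.40 at the difficult level and 0.12 at the easy level.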

We may infer from this that the decrease in the variability in 
raw scores in the upper grades is due largely to the tests having 
become easier in these grades. The K.-A. tests have avoided 
most of this by using more difficult tests for each higher grade. 
Since the more difficult tests correlate higher with a criterion 
than do the easier ones, one is tempted to conclude, further, 
from this alone that the tests have become poorer in the higher 
grades where they have become easier. But the proof of this 
would lie not in these higher correlations themselves, which 
would be higher because of the greater variability of scores, 
but in the reason why this variability changes. 

Why the raw scores become less variable as the tests 
become easier is more or less a matter of speculation. Two 
factors suggest themselves. The first is inherent in the nature 
of the test. Speed of performance in many tests reaches a 
maximum because of mechanical limitations, which may be 
association time, hand or eye movement, or other such process 
involved. When a test has become so easy that the majority 
of children in the grade tested approach this mechanical limita- 
tion, both dull and bright children in the group will make more 
nearly the same score. The second factor lies not directly 
in the test itself, but in the reaction of the children to it. There 
is probably a considerable general tendency for children to relax 
in effort in proportion to the ease of the task. If a child has to 
try hard to pass any or only a few trials in a test he will make the 
necessary effort to do so. If he can pass quite a number of 
trials without much effort he will not work at his maximum. 
This results again in a reduction of the difference in scores 
between dull and bright. For, in the higher grade the test has 
become too easy for the brighter child, but not so much too easy 
for the duller one. 


3. Middle point in the age norms for tests 


The middle point in the age norms for a battery of tests 
should give a rough indication of the age level where the tests 
fit best and where, therefore, they should correlate highest with 
a criterion. If, for example, a battery yields age norms for 
ages five to eleven years, we may assume that this battery is 
likely to give the most reliable results near the age of eight, 
except for two complicating factors. One is the decreasing 
rate of mental development with increasing age, which would 
lower this middle point. The other is the fact that the separate 
trials in such tests are not of uniform difficulty, but in most cases 
increase in difficulty. This raises the middle point. In table 
5 the ages given under “Best Age” indicate the age levels where 
the tests fit best, according to the middle points in the raw 
scores given as age norms in the instruction manuals for the 
tests. The middle points between lowest and highest age 
norms were computed, and the ages are those corresponding to 
these middle points in raw scores. The K.-A. tests are so 
adjusted that each battery fits best the exact age for which 
the battery is recommended, the first battery being for age 
7-0, intended for first grade, because in the majority of in- 
stances children of the first grade are nearer seven than six 
at the time they are tested. 
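The way a "Best Age" is read off from published age norms can be sketched as follows. The norms below are invented, straight-line interpolation between adjacent norms is an assumption, and `best_age_months` is a hypothetical name.

```python
def best_age_months(age_norms):
    """Age at which the norm equals the middle raw score.

    age_norms -- list of (age in months, raw score norm) pairs, with
                 raw scores assumed to rise steadily with age.
    """
    norms = sorted(age_norms)
    middle_score = (norms[0][1] + norms[-1][1]) / 2
    for (age0, score0), (age1, score1) in zip(norms, norms[1:]):
        if score0 <= middle_score <= score1:
            if score1 == score0:
                return float(age0)
            fraction = (middle_score - score0) / (score1 - score0)
            return age0 + (age1 - age0) * fraction
    raise ValueError("middle score is not bracketed by the norms")

# Invented norms for ages five to eleven, with yearly gains that
# shrink as age increases.
example_age_norms = [
    (60, 10), (72, 30), (84, 50), (96, 64),
    (108, 74), (120, 80), (132, 84),
]

example_best_age = best_age_months(example_age_norms)
```

With these invented norms the middle raw score is 47 and the corresponding age is about 82 months, well below the middle age of 96 months. This is the lowering effect of a decreasing rate of development described above.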




















TABLE 5

                         BEST    BEST     RANGE          MENTAL AGES
                         AGE     GRADE    RECOMMENDED    COVERED
Pintner-Cunningham        6-9    I+       Kgn.-II        2-6 to 12-0
Detroit Primary           9-1    III      II-IV          4-0 to 15-0
Otis Group, Primary       9-3    III      I-IV           3-6 to 15-0
National                 11-5    V+       III-VIII       4-6 to 21-0
Detroit Alpha            13-7    VIII     V-IX           5-6 to 21-0
Terman Group             14-3    IX       VII-XII        6-6 to 27-0











Under “Best Grade” are given the grades in which the ages 
would correspond closest to “Best Age.” Under “Range 
Recommended” are given the grades for which the publishers 
recommended the tests. Under “Mental Ages Covered” are 
given the ranges in mental ages that may be found in the grades 
for which the tests are recommended, assuming that intelli- 
gence quotients may range from 50 to 150 in each grade, and 
assuming an exact age of seven for grade I, eight for grade II, 
and so on. The percentage of school children that run to these 
extremes in IQ’s is small, and promotion is, of course, in a 
considerable measure according to ability instead of by chrono- 
logical age alone. This improves the situation. But, after 
making liberal allowances, it is obvious that a very great burden 
is put on any single battery of tests when applied to several 
successive grades. 
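The "Mental Ages Covered" column follows directly from the two stated assumptions: an exact age of seven for grade I, eight for grade II, and so on, and IQ's from 50 to 150. A short sketch, with hypothetical function names, reproduces three of the ranges.

```python
def grade_age_years(grade):
    """Assumed exact age for a grade: seven for grade I, eight for II, etc."""
    return 6 + grade

def as_years_months(months):
    """Format a number of months in the paper's 'years-months' style."""
    years, rest = divmod(months, 12)
    return f"{years}-{rest}"

def mental_ages_covered(lowest_grade, highest_grade, iq_low=50, iq_high=150):
    """Range of mental ages when IQ = 100 * mental age / chronological age."""
    low = grade_age_years(lowest_grade) * 12 * iq_low // 100
    high = grade_age_years(highest_grade) * 12 * iq_high // 100
    return as_years_months(low), as_years_months(high)

otis_group = mental_ages_covered(1, 4)    # recommended for grades I to IV
national = mental_ages_covered(3, 8)      # recommended for grades III to VIII
terman = mental_ages_covered(7, 12)       # recommended for grades VII to XII
```

These give 3-6 to 15-0, 4-6 to 21-0, and 6-6 to 27-0, agreeing with the corresponding rows of table 5.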
Judging from the data in table 5, the grades for which each 
test battery is recommended extend about equally above and 
below the grade where the battery fits best in all instances 
except the Detroit Alpha. The Detroit Alpha tests fit grade 
VIII best, but are recommended for grades V to IX. 


4. Change in correlations from grade to grade 


We refer to the correlations between the scores on a test and 
the total scores on the whole battery. Do these change as 
the battery of tests is given to successive school grades, and are 
they highest for the grade where the tests fit best, best as 
judged by the middle points in the age norms? Table 6 gives 
the data on these correlations, and on three of the disturbing 
factors, which were the average mental level of the children 
in the different grades, the percentage of zero scores on 
particular tests, and the variability of the mental levels of the 
children in each grade as measured by the standard deviation 
of the mental ages taken in months. In the third column from 
the left are given the average mental ages for the grades as 
found by the tests in question in each case. In the fourth 
column are repeated the ages where the tests fit best, as given in 
table 5. In the fifth are given the percentage of scores on 
separate tests that were zero. In the sixth are the S.D.'s 
of the mental ages in months. In the last are the averages of 
the correlations of scores on each test with the total scores on 
the battery. 

Before discussing the main question about this table, let me 
note that the absolute values of the correlations cannot be 
used as a basis for comparing the relative merits of the tests in 
different batteries. We are interested only in the changes in 
those correlations from grade to grade. Their absolute magni- 
tude depends on the grade in which they are used, the number of 
tests in the battery, and on the method of scoring, and on other 
factors. They are relatively quite high for the National tests 
partly because there are only five tests in the battery, and the 
score on each test had a proportionally much larger share in 
determining the total score on the battery. The correlations 
for the K.-A. tests were relatively reduced both because they 
have the largest number of tests per battery, and especially 
because the total score on the battery is here replaced by the 
median score, and the median is much less affected than is a 
total, by a variation in the score on the separate test. 

TABLE 6

                  AVERAGE   BEST    PER CENT           AVERAGE
TESTS    GRADE    MENTAL    AGE     ZERO      S.D.     CORRELATION
                  AGE               SCORES

P.-C.    I         6-5      6-9      14       13.4     0.710
         II        7-9                1        9.6     0.750

D.-P.    II        7-4      9-1      16       11.3     0.723
         III       8-11               5       13.8     0.733
         IV       10-8                2       15.1     0.670

O.-G.    III       9-2      9-3       6        9.6     0.621
         IV       10-0                9        9.3     0.691

D.A.     V        11-8     13-7       3       18.8     0.740
         VI       12-10               2       17.7     0.751
         VII      14-0                1       15.5     0.735
         VIII     15-3                0       17.3     0.794

Nat.     V        10-10    11-5       3       15.8     0.814
         VI       12-1                0       12.6     0.718
         VII      13-3                0       15.5     0.764
         VIII     14-4                0       14.1     0.676

K.-A.    I                  7-0      31        9.9
         II                 8-0      15        9.1
         III       8-8      9-0      13        8.8     0.626
         IV        9-9     10-0       7        9.3     0.701
         V        10-7     11-0       2       11.7     0.774
         VI       11-11    12-0       2        9.9     0.614
         VII      13-2     13-6       1       11.0     0.546
         VIII     13-8     13-6       1       13.4     0.660


In three of the five instances in this table the correlations are 
highest at or very near the age level where the tests fit best, 
as determined by the middle points in the age norms. The two 
exceptions can be explained by the percentages of zero scores. 
Since the question here is of fundamental importance, I shall 
discuss the data step by step. In the Pintner-Cunningham 
results the best fitting age, 6-9, is four months higher than the 
average mental age of grade I, and twelve months lower than 
the average of grade II. But the correlation is four points 
higher for grade II. This is one of the exceptions, explained by 
the 14 per cent of zero scores in grade I as compared with 1 per 
cent in grade II, which lowered the correlation for grade I. 

In the Detroit Primary results the correlation is highest where 
the tests fit best, grade III. Note that the correlation is 
relatively high for grade II where the tests are the most difficult, 
although both the high percentage of zero scores and the low 
S.D. would tend to lower the correlation. 

In the Otis Group tests we have the second exception. The 
best age is only one month higher than the average mental age in 
grade III, and seven months lower than the average in grade 
IV, but the correlation is higher for grade IV. Here again we 
have six per cent zero scores in grade III as compared with two 
per cent in grade IV, with the S.D. approximately the same in 
both grades. 

In the Detroit Alpha the correlations increase from grades 
V to VIII, with a slight exception for grade VI. According to 
the age norms the tests fit best exactly grade VIII in October, 
which was the month in which the tests were given, but the 
average mental ages as found by these tests are a year and eight 
months above normal. On the other hand, it is noted that the 
relatively high S.D. in grades V and VI still leaves the 
correlations lowest for these two grades. 

In the National the correlations should have been about the 
same in grades V and VI, according to the average mental ages 
found, and the age where the tests fit best. Apparently the 
S.D. in grade V accounts for the higher correlation for this 
grade. On the whole, there is a very evident decline in the 
correlations as the tests become easier for the higher grades. 

In the K.-A. tests the correlations should remain unaffected 
by the factor of increasing ease of performance for children of 
higher grades, for the tests as a whole become more difficult 
from grade to grade in proportion to the increase in ability of 
the children in these grades. But there are left the factors of 
change in percentage of zero scores from grade to grade, the 
increase in the spread of the distribution of mental ages, seen in 
the increasing S.D. from grade to grade, and probably an in- 
crease in variability of the particular mental traits that each 
separate test measures. The correlations here, taken as they 
stand, are not conclusive on the question of change from grade 
to grade. Part of their irregularity may perhaps be attributed 
to the fact that for each grade the new tests added to the battery 
and the old ones that were dropped varied in their correlations 
and so disturbed the average correlations for the successive 
grades. But further analysis of the figures reveals the fact 
that, if the other disturbing factors were eliminated, we would 
probably have a consistent decline throughout in the correla- 
tions from the first grade up. Thus, for grades I to III or IV 
the correlations are evidently lowered chiefly by large per- 
centages of zero scores, and partly by the low S.D. For grades 


V to VIII the decrease to a negligible percentage of zero scores 
and the increasing S.D. combined has not been sufficient to 
keep the correlations from going down. This leaves the increas- 
ing variability of the particular traits measured as the remaining 
responsible factor. 


5. Zero and maximum scores 


The matter of zero and maximum scores needs further 
consideration. The median mental age method of scoring was 
adopted for the K.-A. tests chiefly to avoid the undue influence 
of zero and maximum scores, and of other unusual variations 
in the score on some particular test or two in the battery. It is, 
of course, easy to so construct tests as to avoid zero and maxi- 
mum scores largely or entirely. We may make the first trials 
so easy that all will be able to do some of them, and then have so 
many trials, or make the allowed time so short that none can 
finish all. This, however, reduces all tests largely to “speed” 
tests for the more mentally advanced, on which they can show 
their superiority only in one and the same trait. It also 
increases the possibility of larger variations in the score on any 
particular test alone, which is a fault rather than a virtue, when 
the number of tests in a battery is as small as it must necessarily 
be, and when the final score is the total or average score for 
the several tests in the battery. The better way to avoid zero 
and maximum scores is, of course, to fit the tests to the abilities 
of the children at each age level, so that the tests may operate 
at all levels more as “power” tests. It should be said that 
most tests now in use have made a sort of compromise by 
increasing the difficulty of successive trials in the test, as well 
as having a large number of trials beginning with easy ones. 
Just what this does to the test from the standpoint of the mental 
traits involved in doing the test is difficult to say, without care- 
ful analysis of each particular instance. 

In table 7 are given the percentages of zero and maximum 
scores in all the tests used, arranged by grades. They are given 
for the battery as a whole, not for each test separately. Four- 
teen per cent zero scores for the P.-C. tests in grade I means that 
of all the scores on the separate tests for grade I 14 per cent 
were zero. It does not mean that 14 per cent of the children 
had zero scores on some or all of the tests. For example, a 
hundred children and a battery of ten tests would give 1000 
separate scores, 14 per cent of which might be zero. 
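The distinction drawn here, between the percentage of separate scores that are zero and the percentage of children who have a zero score, can be made concrete. The data below are invented and the function names are hypothetical.

```python
def percent_zero_scores(score_sheet):
    """Percentage of all separate test scores that are zero."""
    total = sum(len(row) for row in score_sheet)
    zeros = sum(1 for row in score_sheet for score in row if score == 0)
    return 100 * zeros / total

def percent_children_with_zero(score_sheet):
    """Percentage of children with a zero score on at least one test."""
    with_zero = sum(1 for row in score_sheet if 0 in row)
    return 100 * with_zero / len(score_sheet)

# Invented results for 100 children on a battery of ten tests:
# 70 children score zero on two tests each, 30 children on none.
example_sheet = (
    [[0, 0] + [3] * 8 for _ in range(70)]
    + [[3] * 10 for _ in range(30)]
)

share_of_scores = percent_zero_scores(example_sheet)
share_of_children = percent_children_with_zero(example_sheet)
```

Here 140 of the 1000 separate scores are zero, or 14 per cent, although 70 per cent of the children have at least one zero score.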

In judging these figures the first thing to take into account 
is the fact that for all but the K.-A. tests zero scores are much 
more important than maximum scores. They usually represent 
a much bigger variation from the average of the several scores 
for the separate tests in the battery than do the high scores, 
and unduly penalize the child making them. A maximum 
score, doing correctly all trials in the test, often gives him undue 
credit, as it is, in the mental age he earns thereby, and would 
credit him more unduly if the test had still more trials. Both 
zero and maximum scores, of course, give the child a lower 
total score than he would have received had the test had a wider 
range of trials. 
The figures show at a glance that zero and maximum scores 
occur much more frequently in the lower than in the higher 
grades. At least three factors combine to produce this result. 
One is the larger difference in mental abilities between two 
successive lower grades than between two successive higher 
grades. The same test battery cannot cover as many of the 
lower as of the higher grades for this reason. Note the large 
changes in the relative percentages of zero and maximum scores 
from grade I to II and II to III, as compared with these changes 
for higher grades. All authors of tests have recognized this 
fact by letting a battery of tests for the lower grades cover a 
smaller number of grades than they do for tests designed for 
the higher grades. A second factor is the smaller number of 
trials in the tests for the lower grades. For young children 
variety and frequent change of task is needed in mental tests, 
calling for tests with relatively few trials. This fact is taken 
into account in the P.-C. and K.-A. tests. The third factor 
is the greater difficulty of getting uniform attention and effort 
from the younger children. This affects chiefly zero scores. 

Beyond the third grade, at least, it is evident that all tests 
have avoided zero and maximum scores to a very high degree. 
The casual observer would conclude, and has indeed repeatedly 
concluded, that these tests are entirely adequate for reliable 
measurement in the several grades for which they are recom- 
mended. This conclusion, however, does not follow necessarily. 
Besides being reduced largely to speed tests by the multiplica- 
tion of relatively easy trials per test, as noted above, tests that 
do not give many zero or maximum scores when applied to 
several successive school grades may simply lack in discriminative 
capacity. 

The K.-A. tests give the highest percentage of zero and 
maximum scores in grades I to IV. While this is in itself a 
fault, it is the result chiefly of avoiding the other and more 
serious objectionable features that have been discussed. How- 
ever, for grades I and II, especially, the large percentage of zero 
scores is due partly to the age and mental level of the children. 
The first battery of K.-A. tests is intended for grade I where the 
average age is usually near seven, but in this case the average 
age was 6-0. The second battery is intended for grade II 
where the average age is usually near 8, but in this case the 
average age was 7-5. Our median mental age method of 
scoring overcomes the influence of zero or maximum scores on 
the mental age a child earns. Taking this into account, the 
error that resulted from zero and maximum scores with the 
K.-A. tests is relatively quite small, as compared with what 
happened with the other tests. I shall attempt to make this 
clearer in connection with the next problem taken up. 


B. WEIGHTING AND SCORING OF TESTS 


1. Weighting of raw scores on each test 


Space will not be taken here to discuss the relative merits of 
different methods of weighting the raw scores of tests. The 
transmuting of the raw scores into mental ages in the K.-A. 
tests weights each item in each test in accordance with the 
amount of ability represented by these mental ages. It does 
not take account of the fact that the year represents a decreasing 
amount of development with increasing age, but this objection 
is mostly eliminated because the score earned on the whole 
battery is not a summation of mental ages involving the adding 
of unequal units. For the most part no weighting of raw 
scores is employed in the other tests of this study. In some 
instances wrong responses are taken into account in scoring, 
and in a few others where the number of trials in the test is 
much smaller or larger than in the others of the battery the 
number of correct responses is multiplied or divided by some 
figure so as to make the score on that test more nearly the same 
for the average case as on the others in the battery. 

Most readers will be familiar with the view sometimes held 
that any weighting of raw scores on the group tests now on the 
market does not materially improve the scoring. This is 
contrary to common-sense logic, and a view to which I cannot 
subscribe, “statistical evidence” notwithstanding. The average 
number of trials passed in each test in a battery was computed, 
separately for each grade. These averages and other 
figures are given in table 8. The figures 1 to 8 at the top give 
the order of the tests in the battery. After “Av. S.” follow 
the average numbers of trials passed. The percentages given 
show how much of its share each test contributed to the total 
score of the child. For example, in the P.-C. tests in grade I, 
assuming that each test’s share should be one seventh of the 
total score, test 1 is seen to have contributed 155 per cent instead 
of 100 per cent of its share. The A.D. column on the right gives 
the average deviations of these percentages. 
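
The computation of these percentages may be sketched as follows. This is a minimal illustration under the assumption that a test's "share" is the total average score divided equally among the tests, as the P.-C. example implies; the averages used are invented, and the function names are hypothetical.

```python
def share_percentages(avg_trials):
    """Per cent of its equal share that each test contributes to the total."""
    fair_share = sum(avg_trials) / len(avg_trials)
    return [100.0 * a / fair_share for a in avg_trials]


def average_deviation(values):
    """Mean absolute deviation of the values from their own mean (A.D.)."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Hypothetical average numbers of trials passed on a three-test battery.
avg_trials = [2, 4, 6]
shares = share_percentages(avg_trials)
print(shares)                     # each test's per cent of its share
print(average_deviation(shares))  # A.D. of those percentages
```

A battery in which every test contributed exactly its share would give 100 per cent throughout and an A.D. of zero.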

This table shows that a number of tests contribute three 
times as much to the total score as do some others in the same 
battery, while contributing twice as much as some others is 
quite common. This is true of the average tendencies of the 
tests, from which it follows that for any particular child 
some one test may contribute several times as much to the total 
score as some other test in the same battery 
does. 

The value of any of these tests is, of course, not any larger 
because it contributes a relatively large share to the total score. 
If such relationship exists for any one it is purely accidental. 
To the contrary a test that contributes more than its share to 
the total score, on the average, is a relatively easy test and for 
this reason is usually of less value than the others in the battery, 
depending on whether the battery as a whole is relatively easy 
or difficult for any particular grade under consideration. 


2. Scoring on the battery 


Totalling the trials passed on the several tests in a battery is 
the simplest method of scoring. As compared with the median, 
it has the same advantages or disadvantages as has the average. 
Whether the average or median is the better measure depends 
on the purpose of the measure and the circumstances. In the 
case of group tests, I believe that both the purpose and the 
circumstances give the advantages to the median, as compared 
with the average or total. The purpose is to get a measure 
that is most representative of the child’s general mental abilities, 
on the basis of which we can best predict success or failure in 
such general fields as public school training, for example. The 
chief circumstance is the frequent occurrence of extremely high 
or low scores on a test or two in the battery. When 
the child is in the group, we cannot control his efforts to any 
great extent, and effort fluctuates over a wide range from child 
to child and from moment to moment in any particular child. 
Unusual lack of, or unusually high, capacity to perform some particular 
test or two is the second large cause of exceptional scores. 
Obviously our end will be largely defeated if we allow these 
exceptional scores to influence the score on the battery as a 
whole in proportion to their magnitude. This is what the 
median avoids, and is what the average or total permits. The 
case should need no argument, nor even require demonstration 
with actual results. 
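
For readers who nevertheless wish a demonstration, the following is a minimal sketch with invented figures. The ten mental ages are hypothetical, one of them representing a lapse of effort on a single test.

```python
from statistics import mean, median

# Hypothetical mental ages (in months) earned on the ten tests of a battery.
# Nine cluster near 100 months; the last reflects a lapse on one test.
mental_ages = [96, 98, 100, 101, 102, 103, 104, 105, 106, 60]

print(median(mental_ages))  # barely moved by the one exceptional score
print(mean(mental_ages))    # pulled down in proportion to its magnitude
```

The median stays within the cluster of the nine ordinary scores, while the average is drawn several months toward the single exceptional one.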


C. DISCRIMINATIVE CAPACITY 


The object of tests is to discriminate between different mental 
levels or grades of intelligence. Obviously, other things being 
equal, that test is best whose scores show the greatest difference 
between two different levels or grades. It has been measured 
in various ways. Throughout the ten years of work on the 
K.-A. tests the chief statistical guide in determining the merits 
of a test was its discriminative capacity, as measured by the 
increase in the percentage of children of a given age that passed 
approximately two thirds of the trials of a test, as compared 
with children a year younger. 


1. Increase in raw scores for successive years 


The easiest obtainable evidence of the discriminative capacity 
of a battery of tests is the increase in the scores from year to 
year, for this is already given in the norm tables. This data is 
given in table 9. The second line gives the corresponding 
grades. S. gives the yearly increase in raw score, and the per 
cent is the yearly percentage increase in raw score. 
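
The two quantities may be sketched as below. This is an illustration only: the norms are invented, and it is assumed here that the percentage increase is taken relative to the norm of the preceding year, which the text does not state explicitly.

```python
def yearly_increases(norms):
    """Return (increase, per cent increase) for each successive pair of norms.

    Assumption: the per cent is computed on the preceding year's norm.
    """
    out = []
    for prev, cur in zip(norms, norms[1:]):
        gain = cur - prev
        out.append((gain, 100.0 * gain / prev))
    return out


# Hypothetical raw-score norms for four successive ages.
norms = [20, 30, 36, 39]
for gain, pct in yearly_increases(norms):
    print(gain, round(pct, 1))
```

The shrinking gains in this invented series show the pattern discussed in the following paragraph, a battery losing discriminative capacity at the higher ages.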

These figures all show that every battery of tests gradually 
loses its capacity to discriminate between successive higher 
levels of mental development. As it does so, it must, of course, 
lose in reliability. This is undoubtedly largely due to the 
fact that the differences in general development from year to 
year become smaller. At the lowest age where each battery 
is applied the discriminative capacity is high for all batteries, 
although the batteries begin at various ages from four to twelve. 
A child passing a few trials more or less on any particular test 
is always largely a matter of accident, due to factors unknown 
or factors over which we have no control. When the increase 
from year to year in score for the whole battery becomes small, 
such accidental variations in the score on any one test in the 
battery may alone add or subtract a year or more in the mental 
age a child earns. This is truer the larger the score and the 
easier the separate trials in the tests. The size of the total 
battery score, the ease of performance of the separate trials, 
and the age level must be considered together in comparing 
the figures for the different batteries. When this is done, we 
may say that the Otis Self-Administering, the Otis Primary and 
perhaps the National rank low in table 9, while the Terman, 
the Kuhlmann-Anderson, and the Detroit Alpha rank high. 


2. Overlapping of scores between adjacent grades 


Ordinarily, when two groups of children known to be of 
different levels of development are tested with two different 
batteries of tests, we may assume that the most reliable battery 
is the one which gives the least overlapping of scores between 
the two groups. This assumes that the two batteries are equally 
capable of giving the same range of low and high scores, and 
that the scores dealt with are the original raw scores, not trans- 
muted scores, such as mental age. With transmuted scores, 
overlapping is affected by errors in standardization, displaced 
age-norms. Overlapping may be measured in different ways. 
In these results it was measured by computing the percentage 
of children in a grade whose mental ages, as found by the tests 
in question, fell below the mental age that was half way between 
the average mental age of the lower grade and the average mental 
age of the higher grade. For example, let the average mental 
age for grade IV be 10 years, for the results of a given battery 
of tests. We then compute the percentage of children in 
grade IV whose mental ages fall below 9 years and 6 months. 
The more usual way, consisting of computing the percentage 
in the higher group who fall below the average of the lower, 
and vice versa, was not so applicable here, because these 
percentages approached zero in some instances, and hence were 
less reliable. Table 10 gives the percentages of overlapping, 
computed as described. 
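
The overlapping measure described above may be sketched as follows. The mental ages are invented, and the grade III average of 9 years is an assumption implied by the 9 years 6 months cutoff in the example, not a figure given in the text.

```python
def overlap_percentage(grade_mental_ages, lower_avg, higher_avg):
    """Per cent of children whose mental age falls below the point half way
    between the average mental ages of the lower and the higher grade."""
    cutoff = (lower_avg + higher_avg) / 2
    below = sum(1 for ma in grade_mental_ages if ma < cutoff)
    return 100.0 * below / len(grade_mental_ages)


# Mental ages in months. Grade IV average: 10 years (120 months).
# Assumed grade III average: 9 years (108 months), so the cutoff is 114.
grade_iv = [110, 113, 114, 118, 120, 122, 125, 126, 128, 130]  # hypothetical
print(overlap_percentage(grade_iv, lower_avg=108, higher_avg=120))
```

Only mental ages strictly below the half-way point are counted, following the wording "fell below" in the description.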


TABLE 10 


GRADES I-II | GRADES II-III | GRADES III-IV | GRADES IV-V | GRADES V-VI | GRADES VI-VII | GRADES VII-VIII 

18 
12 

35 35 
29 23 

27 32 
22 25 


Here all comparisons favor the K.-A. tests, excepting the two 
instances for grades VII to VIII. The explanation for the two 
exceptions is not clear. It might be in displaced age norms. 
The figures given below on mental age and IQ distributions 
suggest this possibility. On the other hand, it is of interest to 
note that in the K.-A. tests VII and VIII give the first instance 
where entirely the same battery of tests is used for two succes- 
sive grades. 


3. Standard deviation of mental ages 


The standard deviations computed are of the mental ages 
expressed in months. Some statisticians favor the standard 
deviation as a measure of reliability. It is likely to show 
the same facts about tests as do measures of overlapping, and 
both are related to discriminative capacity, as discussed above. 
However, conditions are possible that give a low discriminative 
capacity as well as a low S.D., and poor reliability. Table 11 
gives the results. 


TABLE 11 


GRADE I | GRADE II | GRADE III | GRADE IV | GRADE V | GRADE VI | GRADE VII | GRADE VIII | GRADES IX-XII 

25.1 
20.4 

17.4 
22.0 



The table gives seventeen comparisons. Of these fifteen 
favor the K.-A. tests, and in one, the Otis and K.-A. for grade 
IV, the standard deviations are the same. The one exception 
is for the Terman tests for grades IX to XII. 


D. CORRELATIONS WITH OTHER CRITERIA 


All correlations given hereafter were computed by the 
Pearson product-moment method, and were between mental 
ages expressed in months, or mental ages and school marks. 
Since the data for each grade below IX were treated separately, 
all the correlations obtained were necessarily low, as compared 
with those of most other studies. Correlations were computed 
between the mental ages on the K.-A. tests and the correspond- 
ing mental ages on the other tests used on the same group of 
children. The average school mark for the year preceding was 
secured for all children, except those tested with the Detroit 
tests, and those in grades I and IX that were tested by the other 
tests. Correlations were computed between these school marks 
and the mental ages as found by the different tests, except that 
for the results in grades X to XII, where the mental age divided 
by the age that was normal for the grade was substituted for 
the mental age. One hundred eighty-seven months was 
assumed as normal for grade X in October. The results for 
these three grades were lumped together, making this proced- 
ure advisable. 
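
The product-moment computation, and the substitution made for grades X to XII, may be sketched as follows. This is a minimal illustration with invented data; only the 187-month figure for grade X comes from the text, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ratio_to_normal(mental_age_months, normal_age_months):
    """Mental age divided by the age taken as normal for the grade."""
    return mental_age_months / normal_age_months


# Hypothetical grade X pupils: mental ages in months and school marks.
mental_ages = [170, 180, 187, 195, 205]
marks = [72, 80, 78, 85, 90]
ratios = [ratio_to_normal(ma, 187) for ma in mental_ages]
print(round(pearson_r(ratios, marks), 3))
```

Dividing by the normal age for each grade puts pupils from the three grades on a common footing before their results are lumped together.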


TABLE 12 


GRADE III | GRADE IV | GRADE V | GRADE VI | GRADE VII | GRADE VIII | GRADES IX-XII 


1. Correlations between results of different batteries 


These correlations usually have meant little or nothing as 
regards the value of the tests involved. Details necessary for 
their interpretation are usually lacking. Could we assume the 
superiority of any one battery to begin with it would seem that 
we might determine the relative merits of others in comparison 
with the superior one. But there are always likely to be so 
many disturbing factors affecting the correlations that any 
existing difference between two batteries is lost in variations 
in the correlations that cannot be explained. It is so very 
largely with our results, which are given in table 12. 

Each correlation in the table is between the K.-A. tests and 
the one named on the left. On the whole, the correlations are 
lower for the lower grades. In comparing table 6 with 12 it 
will be seen that there is some agreement. The results from the 
batteries in which the scores on the single test correlated high 
with the total battery score also correlate high with the K.-A., 
when taken grade by grade. For the Detroit Primary the 
correlations are high in both tables for grades II and III, and 
low in both tables for grade IV. For the Otis Primary, they 
are highest in both tables for grade IV, and lowest in both for 
III. For the Detroit Alpha the correlations run: 


0.751 0.735 0.794 
0.70 0.70 0.77 


0.814 0.718 0.764 0.676 
0.78 0.75 0.65 0.64 


When, however, the differences in variability in the mental 
ages in each grade are taken into account, it becomes obvious 
that the differences in the correlations in table 12 were deter- 
mined chiefly by this variability. For each correlation in 
table 12 the two S.D.’s were averaged and plotted against the 
correlations in the scatter-graph (Graph 1). 

This shows at once that the correlations follow the S.D. very 
closely, the only two significant exceptions being for the results 
from grades IX to XII, which were lumped together in computing 
the correlations. The correlations by themselves, in 
fact, mean very little. The increase in variability in mental 
ages from grade to grade needs further inquiry and explanation. 
This will be considered in connection with the mental age 
distribution curves. 


2. Correlations with school marks 


The merits of attempting to determine the value of intelli- 
gence tests by correlations with school marks have often been 
discussed. It seems to be granted that it speaks well for an 
intelligence test if this correlation is high, but not too high! 
It was thought that they might be of some value in this study 
when taken in conjunction with the various other data. They 
are given in table 13. 

There are nine comparisons of the K.-A. tests with others, 
in the table. Four favor the K.-A. tests. The K.-A. correlate 
higher with school marks than do the Otis Primary, the Otis 
Self-Administering, and the Terman. For the Pintner-Cunningham 
and the K.-A. tests the correlation is the same. For 
the four grades figuring in the comparison with the National, 
the National correlate higher in each instance. 

In looking for an explanation for the superiority of the National 
in this instance it may be noted, first, that one of the 
five tests in the battery consists of arithmetical problems, 
involving largely arithmetical information rather than arithmetical 
reasoning. This is one fifth of the battery. Secondly, 
these five tests were selected from twenty-two that were given 
a preliminary trial, and agreement of the preliminary results 
with teachers’ estimates of the intelligence (which would 
naturally be based largely on school records) was apparently 
made a basis for selecting the tests.³ Thirdly, other studies 
have found results with the National to correlate relatively 
high with school records.⁴ This does not in itself prove anything 
about the relative merits of the K.-A. and National as 
intelligence tests, but indicates rather that the difference 
apparently in favor of the National in table 13 is very probably 
due to their measuring school achievement directly more than 
the K.-A. tests. With this taken into account, the results of 
table 13 very strongly favor the K.-A. tests as a measure of 
intelligence. 


3 Whipple, G. M. The National Intelligence Tests. Journ. Educat. 
Research, June, 1921. 


TABLE 13 


GRADE II | GRADE III | GRADE IV | GRADE V | GRADE VI | GRADE VII | GRADE VIII | GRADES IX-XII 

0.60 | 0.61 
0.47 | 0.51 


E. DISTRIBUTION OF SCORES 


1. Distribution of mental ages 


The resemblance of the distribution of the scores on mental 
tests to the normal distribution curve has frequently been used 
as evidence of the validity of the tests. Perhaps the following 
may be accepted as reasonable assumptions. 

1. Persistent correspondence of the distribution of scores to 
the normal distribution curve, when all known precautions 
have been taken to eliminate invalidating factors, indicates 
validity as a measure of intelligence. 

2. Persistent large variations from the normal distribution 
in any particular indicates a faulty construction of the tests and 
lack of validity. 

Table 14 gives the distributions of mental ages for each battery 
of tests and each grade. They are given in half-year intervals 
for grades I to VIII, and in year intervals for grades IX to 
XII. It will be seen at once that the K.-A. test distributions 
are in nearly all instances closer approximations to a normal 
distribution than are the corresponding distributions on the other 
tests. Exceptions are the Terman and Otis in grades IX to 
XII, and Otis in grades III and IV. Distributions varying 
most from the normal by being too flat are the National in all 
grades and the Detroit in all grades, both tests having instances 
of extremely flat curves. The following graphs are based on the 
figures in the table, and are given as illustrations. The dotted 
line graph represents the K.-A. tests in each case. 


4 Garrison, S. C., and Robinson, M. S. A Study of Retests. Journ. 
Educat. Research, March, 1925. This study reports a correlation of 
.55 between teachers’ judgments and the Stanford Binet tests, as compared 
with .76 between teachers’ judgments and the National. It 
found a correlation of .58 between educational test results and the 
Stanford Binet, as compared with .79 between educational test results 
and the National tests. 

A few marked irregularities at either the lower or upper end of 
the distribution should also be noted. In the K.-A. tests for 
grade I there were many zero scores, and the lowest mental age 
on the scale is five years, no months. Children with several 
zero scores were given the lowest M.A. earned on any test, 
which in many instances would, of course, be too high. Hence 
the markedly disproportionate numbers scoring between five 
and six. The scale does not go down low enough for the group 
tested. 


GRAPH 2 


For the National in grades VII and VIII there is a marked 
rise in the number of cases at the upper end. This scale does 
not measure mental ages above 15-6. Consequently all cases 
with this mental level and above fell into this one group, without 
discrimination. Forty-one per cent of all cases in VIII 
get a mental age of 15-6. The scale is not extended up far 
enough for the seventh and eighth grade levels. 

This is equally true of the Otis Self-Administering, when 
the mental ages are recorded according to the age norms given. 
The highest raw score age norm on this is 42, which is the 
average for age 17-6. If our results had been scored from the 
norm table, there would have been 60 cases in the highest 
mental age group, under 17-0 to 17-11. This is 35 per cent of 
all cases. Instead of using the norm table, however, the 
“IQ’s” were determined directly from the raw scores, as 
directed by Otis, and the mental age was then computed from 
this “IQ” and age. This procedure is not mentioned by 
Otis, but makes discrimination in mental age above 17-6 
possible. On the other hand, it is obvious that the “IQ” 
determined as directed by Otis is a decided misnomer, as it 
varies widely from the IQ resulting from dividing mental 
age by age. For example, a case age 15 with a raw score of 
27 would have a mental age of 12-9, and a true IQ of 85, 
but the “IQ” determined as directed by Otis would be 91. If 
the age were 12-9 and the mental age 15-0, his true IQ would 
be 118, while the Otis “IQ” would be 109.⁵ 
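
The arithmetic of the two examples may be checked with the sketch below. The true IQ is mental age divided by age, as stated. The Otis rule is not given in this paper; the sketch assumes it to be 100 plus the difference between the raw score and the norm for the child's age, and the norms of 36 at age 15-0 and 27 at age 12-9 are inferred from the figures quoted, not taken from the Otis manual.

```python
def months(years, extra_months=0):
    """Convert an age written as years-months into months."""
    return 12 * years + extra_months


def ratio_iq(mental_age_months, age_months):
    """True IQ: mental age divided by age, times 100, rounded."""
    return round(100 * mental_age_months / age_months)


def otis_style_iq(raw_score, norm_for_age):
    """ASSUMED rule: 100 plus the raw score's deviation from the age norm."""
    return 100 + (raw_score - norm_for_age)


# First example: age 15-0, raw score 27 (mental age 12-9).
print(ratio_iq(months(12, 9), months(15)), otis_style_iq(27, 36))
# Second example: age 12-9, mental age 15-0 (raw score 36 under the
# inferred norms).
print(ratio_iq(months(15), months(12, 9)), otis_style_iq(36, 27))
```

Under these assumptions a fixed difference in raw score gives the same number of "IQ" points at every age, whereas the ratio varies with the age taken as divisor; this is one way of seeing why the two quantities diverge.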

When the distributions in table 14 are compared with the 
correlations in table 12 it is seen that on the whole a flat distribution 
curve for a test battery goes with a high correlation between 
the scores on it and scores on the K.-A. tests. The Detroit 
and National have the flat distributions and correlate high with 
the K.-A. Of course, we have here the well known influence 
of high variability of scores on raising a correlation. But the 
flat distributions in these instances are due unquestionably to 
undesirable traits of the test batteries, rather than to merits 
that might be shown by high correlations with other criteria. 


5 See his Manual of Directions, page 7. 


2. Causes of flat distribution curves 


The median mental age method of scoring the K.-A. tests 
prevents many extreme variations in mental ages, and this is 
one of the factors in the closer approach of its results to a normal 
distribution. A second factor is undoubtedly the smaller 
variability of scores on the easy test as compared with the more 
difficult one, which was noted above in connection with the 
variability in raw scores from lower to upper grades. We have 
a decreasing variability in raw scores and an increasing variability 
in mental ages in passing from lower to higher grades. 
This apparent contradiction disappears when we take into 
account all the facts. A decreasing variability in raw scores 
as the tests become easier means a smaller difference in scores 
between bright and dull of any age or grade, and therefore 
also means a smaller increase in age norms from one age to the 
next. Consequently, a relatively slight variation in raw score 
for any child above or below the average for his age means 
a relatively large variation in mental age; consequently, also, 
the flatter mental age distribution curves for the tests consisting 
of many relatively easy trials, and the closer approach to a 
normal distribution for the mental ages as determined by the 
K.-A. tests. 








3. Effects on grade placement 


Our data permit showing the relative accuracy of grade 
placement according to the results of the different tests only 
in a small degree. Table 15 gives the percentage of children in 
each grade who fall above and below the average mental age 
of the grade by as much as a year or more. The average mental 
age is the average as found by the tests in question in each 
instance. For example, in grade I 39 per cent of the children 
had a mental age one year or more above or below the P.-C. 
average as found by the P.-C. tests. Twenty-seven per cent 
had a mental age one year or more above or below the K.-A. 
average as found by the K.-A. tests. 
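
The percentage described may be computed as in the following minimal sketch. The mental ages are invented and the function name is hypothetical.

```python
def pct_year_or_more_off(mental_ages, threshold_months=12):
    """Per cent of children whose mental age lies a year or more above or
    below the average mental age of their own grade on the same tests."""
    avg = sum(mental_ages) / len(mental_ages)
    off = sum(1 for ma in mental_ages if abs(ma - avg) >= threshold_months)
    return 100.0 * off / len(mental_ages)


# Hypothetical grade I mental ages in months; their average is 84.
grade_i = [70, 78, 80, 84, 84, 84, 88, 90, 98, 84]
print(pct_year_or_more_off(grade_i))
```

Because the average is taken from the same battery whose results are being judged, displaced age norms do not enter into this figure.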

There is only one instance in this table in which the K.-A. 
tests do not show less misplacement of children in the grade 
than the other tests with which they are compared. This is 
in grade IV, where the Otis tests show one per cent less mis- 
placement. The figures in this table follow, of course, neces- 
sarily from the facts given in the preceding section. 


F. AGE NORMS 


Our results cannot, of course, furnish an adequate basis for 
determining the accuracy of the age norms for any of the tests 
used, since the study was limited to two towns, and all the 
school children were tested in only one of them. The question 
as to whether the average scores for any test battery that are 
taken as true average for successive ages of non-selected children 
are exactly right is not of primary importance. A general 
tendency for tests to give IQ’s a few points too high or too low 
is not a serious matter. Moreover, the entirely “non-selected” 
group of children at any one age is a matter of pure imagination. 
Such a group can never be realized even when the matter of 
possible race differences is left out of account. 


1. Average intelligence quotients for different grades 


In all grades except the first the average IQ was over 100 for 
every one of the test batteries used. Assuming that there is 
some tendency for children to progress through the grades 
according to ability instead of entirely by age, this is as it should 
be. However, there is no general increase in the average IQ 
as we pass from lower to higher grades. A more important 
aspect of the results is the fact that the average IQ for a grade 
is in almost every instance a little lower for the K.-A. tests 
than for the other batteries. Table 16 gives the results. 


TABLE 16 


I | 97 | 98 
II | 105 | 108 
III | 109 | 104 
IV | 105 | 103 
V | 104 | 102 
VI | 105 | 103 
VII | 107 | 106 
VIII | 107 | 101 
IX-XII | 103 | 103 | 105 | 100 | 101 


The reason for the K.-A. tests giving average IQ’s for the 
different grades that are nearer 100 than do the other tests is 
not clear. Other Minnesota towns in which different grades 
were completely surveyed with the K.-A. tests have consistently 
given the same results as shown in table 16. We might assume 
that Minnesota children are on the whole a little brighter than 
were the children in other states where age norms for the other 
test batteries were secured. If so, the Minnesota children in 
this study would, of course, score relatively high on the other 
tests, and relatively low on the Kuhlmann-Anderson tests 
whose age norms were secured from Minnesota children. We 
have no evidence from any other sources to indicate that this 
assumption is correct. On the other hand, there was, I believe, 
a significant difference between the procedure in securing age norms 
for the K.-A. tests and the procedure for the other tests. The 
K.-A. tests were always given by extensively trained and 
thoroughly experienced mental testers. We have had repeated 
occasions to observe the influence of the group tester on the 
results obtained. The K.-A. test age norms may be relatively 
high because the testers got the children’s best responses. In 
this study both the K.-A. tests and the others were given by 
highly qualified testers. 


2. Scoring above the level of the highest age norm 


The highest age norm for the K.-A. tests is for age 18-0. For 
the National it is 15-6, for the Terman 19-6, and for the Otis 
Self-Administering it is 16-9. Theoretically, the highest age 
norm for any tests should be the age at which the tests cease 
to give higher scores for average or non-selected subjects above 
this age. That point would vary with different tests, unless the 
tests were equally capable of measuring the small yearly increments 
in mental development at these highest levels. Since 
no one has ever had non-selected subjects at these levels, this 
point has never been determined for any tests, and the highest 
age norm for any tests is always more or less arbitrarily chosen. 
All the tests used in the upper grades in this study give some 
increases in the average score for the different grades, up to 
and including grade XII, but they differ considerably in the 
size of these increments. To make the highest age norm at 
an age lower than where these yearly increments, as found, 
cease is probably an attempt to guess at the age where mental 
development as measured by the tests ceases, on the whole. 

The Otis Self-Administering and the K.-A. tests make some 
provision for IQ scoring when raw scores are above the highest 
age norm. This prevents the lumping of IQ’s and especially 
of mental ages at the upper end of the distribution, and also 
prevents misleading averages. If mental age scores cannot 
go above sixteen, for example, many High School pupils and 
quite a number in the eighth grade will be scored too low. In 
computing intelligence quotients in this study the highest age 
used for divisor was the age corresponding to the highest age 
norm for the tests in question, except for the Terman tests, 
since Terman’s instructions put the maximum divisor at sixteen. 
Terman’s instruction results, of course, in a higher IQ for all 
pupils whose age is over sixteen. The average mental age of 
the pupils given the Terman and K.-A. tests is lower for the 
Terman tests, but the average IQ is higher for the Terman 
for this reason. 
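
The effect of the maximum divisor may be seen in the following sketch. The pupil is invented; the two maximum divisors, sixteen years for the Terman and 18-0 for the K.-A., are those stated above, and the function name is hypothetical.

```python
def iq_with_max_divisor(mental_age_months, age_months, max_divisor_months):
    """IQ in which the age used as divisor is not allowed to exceed a maximum."""
    divisor = min(age_months, max_divisor_months)
    return round(100 * mental_age_months / divisor)


# Hypothetical pupil: age 17-0 with a mental age of 17-0 (204 months each).
print(iq_with_max_divisor(204, 204, 16 * 12))  # divisor held at sixteen years
print(iq_with_max_divisor(204, 204, 18 * 12))  # divisor allowed up to 18-0
```

The same pupil with the same mental age thus earns a higher IQ wherever the divisor is held at sixteen, which is the direction of the difference noted for the Terman tests.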


G. SUMMARY AND DISCUSSION 


1. The Kuhlmann-Anderson tests were compared with seven 
others in a testing program as follows: (a) Pintner-Cunningham 
in grades I and II; (b) Otis Primary, Form A, in grades III and 
IV; (c) Detroit Primary, Form C, in grades II, III, and IV; 
(d) National, Scale A, in grades V, VI, VII, and VIII; (e) 
Detroit Alpha, Form M, in grades V, VI, VII, and VIII; (f) 
Otis Self-Administering in grades IX, X, XI, and XII; (g) 
Terman Group in grades IX, X, XI, and XII. Wherever one 
of the other tests was given the Kuhlmann-Anderson tests 
were also given immediately before or after. The Detroit 
tests were given only to selected classes in the grades mentioned, 
which were in a second town. All the others were given in the 
first town, where all classes in all grades were tested. The 
number of children figuring in any comparison in any grade 
varied from 89 to 221. 

2. In general, the scores on a single test in any battery 
correlated higher with the total scores for the battery the 
more difficult the test in question was. While the other tests 
become easier for the different grades as we pass from lower 
to higher because the same battery is used in several successive 
grades, the K.-A. tests remain more nearly of the same difficulty 
for successive grades by changing to a different battery for each 
grade. 

3. The variability in raw scores, (42), for the single test
on a battery decreases markedly as the test becomes easier. 
To this there are no exceptions in going from a lower to a higher 
grade for any of the tests, or in comparing the first, easy, half 
of a K.-A. battery with its second, more difficult half. This 
high variability is therefore associated with high correlation 
of single test scores with the total scores for the battery. 

The variability is higher for the K.-A. tests in seven out of 
nine comparisons made between averages. 

The explanation suggested for this decrease in variability in 
raw scores on the separate test, while the variability in mental 
ages increases in going from lower to higher grades is as follows: 

(a) All tests are likely to have certain mechanical limitations 
in speed of performance, because of a minimum required time 
in going through the mental and motor steps involved in the 
task set. As tests are applied in higher and higher school grades
these mechanical limitations are approached, and the dull and 
bright will then make more nearly the same score. 

(b) As a test becomes easier there is a corresponding tendency
for effort to decrease. This also tends to make the scores for 
dull and bright more alike as the test is applied in higher grades. 

Increase in variability in the mental ages in going from lower 
to higher grades occurs because for the higher grades the yearly 
increase in age norms is much smaller, so that a slight variation 
in raw score gives a relatively quite large variation in mental 
age. 
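
The arithmetic behind this last point can be sketched as follows. The norm table is entirely hypothetical, chosen only so that the yearly gains shrink at the upper ages; the linear interpolation is likewise an assumption and not the published K.-A. procedure.

```python
# Hypothetical age norms: (age in years, average raw score on one test).
NORMS = [(8, 10.0), (9, 16.0), (10, 22.0), (11, 27.0), (12, 31.0),
         (13, 34.0), (14, 36.0), (15, 37.5), (16, 38.5)]

def mental_age(raw: float) -> float:
    """Linearly interpolate a mental age (years) from the norm table."""
    for (age_lo, raw_lo), (age_hi, raw_hi) in zip(NORMS, NORMS[1:]):
        if raw_lo <= raw <= raw_hi:
            return age_lo + (raw - raw_lo) / (raw_hi - raw_lo) * (age_hi - age_lo)
    raise ValueError("raw score outside the norm table")

# The same two-point difference in raw score, taken low and high on the scale.
low_spread = mental_age(17.0) - mental_age(15.0)    # near age 9
high_spread = mental_age(38.0) - mental_age(36.0)   # near ages 14 to 16

print(round(low_spread, 2), round(high_spread, 2))
```

With these assumed norms a two-point raw-score difference is worth about a third of a year near age nine and a year and a half near the top of the table.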

4. Zero and maximum scores on single tests in a battery are 
in most instances quite successfully avoided in all the other 
batteries by a large number of trials in each test beginning 
with very easy trials and by short time allowed for a test. This
is especially true of both the Detroit batteries used, and of
the National, in less degree. This, however, makes the individual
trials relatively easy and tends to reduce the test to a
“speed test.” The K.-A. tests give more zero and maximum
scores in at least the first three grades. The median mental
age method of scoring used eliminates the error resulting from
zero and maximum scores, except perhaps when the mental
level of the group falls much below seven years. In the present 
study the average mental age in the first grade was very near 
six, with a number going below mental age 5-0, the lowest 
possible score on the K.-A. tests. The P.-C. tests were better 
adapted for this group than the K.-A. 

5. Since in the K.-A. tests the raw score in each single test 
in the battery is transmuted into a mental age, the raw scores 
are weighted in accordance with the average abilities of children 
of different ages. In all the other tests the raw scores are 
unweighted, except in some instances where the number of 
trials passed is multiplied or divided by a certain figure so as 
to make the test count, on the whole, roughly the same as the
others in the battery. In the present results some of the tests
have on the average contributed three times as much to the total
score for a battery as have other tests in the same battery,
while contributing twice as much was quite common. For 
many individual children the difference in what different tests 
in the battery have contributed to the child’s total score was, 
of course, much greater. 

6. The median mental age method of scoring in the K.-A. 
tests has the advantage of avoiding the undue influence of 
exceptionally low or high scores on a test or two in a battery 
on the mental age a child earns. This undue influence is 
probably one of the most important causes of earned mental 
ages which do not represent the true general ability of the child 
tested. 
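
A small numerical sketch shows the point. The ten mental ages below are hypothetical and are given in months; the comparison with the arithmetic mean is added only for contrast.

```python
from statistics import mean, median

# Hypothetical mental ages (in months) earned on the ten tests of one battery.
# Two tests give exceptionally low values; the other eight agree closely.
test_mental_ages = [60, 66, 118, 120, 120, 122, 123, 124, 125, 126]

median_ma = median(test_mental_ages)   # score under the median method
mean_ma = mean(test_mental_ages)       # what a simple average would give

print(median_ma, mean_ma)
```

Here the median stays with the eight tests that agree, while the mean is pulled down by about ten months by the two exceptional ones.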

7. The capacity for a test to discriminate between the abilities 
of successive age levels is indicated somewhat by the yearly 
increase in age norms from lower to higher ages. The per- 
centage of yearly increase in average raw scores from one year 
to the next is higher for the K.-A. in practically all instances. 
Table 17 gives these percentages of increase for approximately 
the ages that would correspond to the grades in which the
respective tests were given in this study. 

8. The relative amount of overlapping in scores between two 
groups known to be different for any two tests is usually some 
indication of the relative validity of the two tests. In nine out
of eleven comparisons the K.-A. give less overlapping in mental 
ages between two successive grades. 

9. The standard deviation or other measure of variability 
is sometimes favored as an indication of reliability, that test 
being regarded as most reliable which shows the smallest varia- 
bility in the scores of a group. In fifteen out of seventeen 
comparisons the S.D. of the mental ages was smaller for the
K.-A. tests.

10. Correlations were computed between the mental ages on
the K.-A. tests and the others, separately for each grade,
except grades IX-XII. On the whole these are highest between
the Detroit Alpha and the K.-A. in grades V-VIII, between
the National and the K.-A. for grades V-VIII, and between
the Terman and the K.-A. in grades IX-XII, where they range
from 0.64 to 0.80. Differences in these correlations follow
closely the differences in variability as measured by the S.D.,
and this is regarded as the true explanation of the correlation
differences found. High variability in mental age is due to the 
tests being made up of a large number of relatively easy trials 
and the relatively small yearly increase in age norms. The 
poorer tests have given the higher correlations with the K.-A. 
According to the age norms, the National are best adapted 
near grade V, becoming less so for the higher grades, and the 
correlations with the K.-A. decrease consistently from V 
to VIII. 

11. Correlations were computed between the mental ages and 
the average school marks for the preceding year. This was 
done for all tests given in Town A where all children were 
tested, and for each grade separately, excepting that grades I 
and IX were omitted, and the results for grades X to XII were 
lumped together. These correlations are higher for the K.-A. 
tests in four out of nine comparisons, and lower in four others. 
The four comparisons in which the correlation with school marks 
is lower for the K.-A. tests are all with the National, grades V to 
VIII. It is noted that one of the five tests in the National 
battery consists of arithmetical problems, that the five tests 
were selected from twenty-two partly on the basis of agreement 
of results with teachers’ estimates of intelligence, and that the 
National correlate higher with educational test results than 
do the Stanford Binet. 

12. The distribution of mental ages for a grade approaches 
the normal distribution more closely for the K.-A. in nearly all 
instances. The National and both Detroit batteries give 
especially flat distributions. The explanation given is the 
median mental age method of scoring the K.-A. tests, and the 
too easy trials that the other tests are made up of for the grades 
in question. 

13. Flatter distributions in mental ages in a grade mean more 
misplacements in the grade of individual children. In fourteen 
out of fifteen comparisons with other tests the K.-A. show less 
misplacement of children in the grades. 

14. In all grades except the first the average IQ for the 
grades was over 1.00 for all tests. In nearly all comparisons 
the average IQ is nearer 1.00 for the K.-A. tests. 

15. The highest age norm for the K.-A. battery is for age
18-0, for the National it is 15-6, for the Terman 19-6, and for 
the Otis Self-Administering it is 16-9. The K.-A. and the 
Otis make provision for the IQ scoring when the raw score is 
above that of the highest age norm. The National and Terman 
do not. This results in some bunching of the highest scores in 
grades IX to XII with the Terman tests, and in very marked
bunching of the highest scores in grade VIII with the National 
tests. Forty-one per cent score a mental age of 15-6 on the 
National in grade VIII. 

16. No second or “B” form of the K.-A. tests has been worked
out, and none is planned. This is for the reason that a second 
form has but small practical usefulness. A second form of the 
same tests is very rarely given immediately after the first for 
other than research purposes. When an interval of a year 
or more follows the first test, a second form of the same tests is 
no longer adequate, and the K.-A. tests with a different battery 
to fit the now older children provides what is needed better than 
does a second form of the first. 

The K.-A. tests have not been given twice to the same children 
in more or less immediate succession for the purpose of finding 
reliability coefficients through correlations between results of 
the first and second giving of the tests. The writer has not 
found such coefficients very useful in judging the merits of 
tests. Apparently any tests quite consistently give about the 
same score on repeated application. The trouble lies in the 
fact that they are as consistent in making the same mistakes 
on the same child as they are in giving a correct score, meaning 
by correct score one that represents general mental ability. 
Whether or not this is a matter of “validity” depends on how
we define validity and “general mental ability.” Any particular
test in a battery may measure quite accurately some mental
trait that it is intended to measure and also do this consistently, 
but the trait in question may be so under- or over-developed 
in a given child that the score on the test in question will 
entirely distort his total score on the battery. This is one of 
the greatest difficulties to overcome in all mental tests. 

17. The mental test movement since Binet has been essentially
a development of statistical technique as applied to mental
tests. Out of this has come a widespread acceptance, on the 
part of lay readers and users of mental tests, of the coefficients 
of validity and reliability as the last and only thing needed for 
gauging the value and usefulness of mental tests. The writer’s 
view has been for some time that these coefficients, when un- 
accompanied by a most thoroughgoing analysis of all conditions 
and circumstances, are all but valueless. It is hoped that this 
study has furnished at least some of the reasons why, beyond 
the usual pitfalls in their use always pointed out by statisticians. 

The unusual interest in developing and applying statistical 
technique has all but crowded out experimental observation. 
We have volumes on method, and next to nothing, in compari- 
son, on controlled observation on tests, both discarded and 
retained. Most of the literature on mental test scales, in 
fact, gives little indication that there have been any that were 
tried out and discarded. The writer and co-author of the K.-A. 
tests have been equally guilty in this respect. At least, we 
also have not published the seventy-five or so tests tried out 
and not included in the final scale, together with the accumu- 
lated data and observations we had on them. Such things are 
impossible without interested readers or financial burdens on 
the producers. Psychology should be due for a return to 
psychology in mental tests, and statistical treatment of psycho- 
logical observations should get out of the rut of the Pearson r. 
It would be well to recall that correlation technique played no 
noticeable part in giving us the Binet scale, but that it was 
preceded by years of painstaking experiment and observation 
on children. 








EFFECTS OF FATIGUE ON THE DISTRIBUTION OF 
ATTENTION 


H. F. VERWOERD 
University of Stellenbosch, South Africa 


If an O must perform a task requiring a sustained distribu- 
tion of attention for some time, fatigue sets in. This fatigue 
can manifest itself in several ways as the following investigation 
shows. 

The O’s, 15 in number, had to react 15 to 25 minutes on the 
“Attention- and Fatigue-meter” of Piorkowski after it had 
been somewhat altered.1 Whenever a white strip appeared in
any one of five openings to the left of a white mark, they had 
to react on one button, but on another whenever a white strip 
appeared in any one of five openings to the right of the mark. 
The correct reactions were registered separately for each of the 
ten openings by means of ten electric counters, and the total 
number of possible reactions by means of a further counter. 
The percentages of correct reactions for each opening could 
then be calculated and taken as an indication of the distribution 
of attention.1 Readings were taken after the appearance of
200 strips, i.e., every 14 minutes, by switching off the counters 
for about half a minute. The O had, however, to continue 
without resting and could therefore not be allowed to notice 
that the reactions were not being registered during these 
short intervals. Hence a soundproof partition between the O


1 The apparatus and the alterations, with the reasons for believing 
that a distribution of attention is really present during experimentation, 
were described fully in a former article on ‘‘The Distribution of Atten- 
tion and its Testing’”’ in Journal of Applied Psychology, vol. 12, no. 5, 
1928. 
and the E (with the counters) was found desirable.2 Some
of the changes in the performance scores during the later periods 
could be interpreted as effects of fatigue on the distribution of 
attention. 

Various such effects of fatigue could be distinguished. 1. 
Sometimes the performance as a whole suffered. This happened 
when there was a decrease in the total scores during the later 
periods (see table 1). Sometimes when the general total 
decreased there could also be noticed a change in the evenness 


TABLE 1 
Observer 38 


The number of correct reactions for each opening and the total scores 
(T.S.) are expressed as percentages of possible reactions. 
or the unevenness of the distribution of attention, the latter
becoming more uneven than before (table 1). There were other 
cases however where the evenness or the degree of unevenness 
remained the same. This is important since it means that from 
the determination of the influence of fatigue on the total height 
of a performance where a distribution of attention was neces- 


2 Two sets of counters, to be switched on alternately, would have
been more satisfactory in as far as a complete record of the O’s perform- 
ance would have been obtained and no short intervals missed out. So 
many counters were however not obtainable. 
sary, one may not deduce that the evenness of the distribution
had also diminished. The distribution may have been quite 
as even as before but only on a lower general level. 

2. An increased unevenness in the distribution of attention, 
although the total height of performance remained at least the 
same, was also found. In some instances where the total scores
of the performance remained the same or even increased during 
the later periods, the standard deviations of the percentages of 
correct reactions at the ten openings for each period also grad- 
ually increased (see table 2). This means, if the standard 

TABLE 2
Observer 37

deviations can be taken as an indication of the evenness or
degree of unevenness of distribution,3 that the distribution became
more and more uneven as time passed. Such an increased
unevenness is looked upon as an effect of fatigue on the dis- 
tribution of attention. 
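
The calculation described here can be sketched as follows. The counts are hypothetical, and two further assumptions are made that the text does not state: that the 200 strips of a period fall evenly on the ten openings (20 each), and that the population form of the standard deviation is meant.

```python
from statistics import pstdev

STRIPS_PER_OPENING = 20  # assumption: 200 strips spread evenly over 10 openings

def unevenness(correct_per_opening):
    """Return (total % correct, S.D. of the ten per-opening percentages)."""
    pcts = [100.0 * c / STRIPS_PER_OPENING for c in correct_per_opening]
    total = sum(pcts) / len(pcts)
    return total, pstdev(pcts)

# Hypothetical counts of correct reactions at the ten openings.
early = [16, 16, 16, 16, 16, 16, 16, 16, 16, 16]   # an early period
late = [10, 12, 18, 20, 20, 20, 20, 18, 12, 10]    # a later period

early_total, early_sd = unevenness(early)
late_total, late_sd = unevenness(late)

print(early_total, round(early_sd, 1))
print(late_total, round(late_sd, 1))
```

Both periods give the same total score of 80 per cent, yet the standard deviation rises from zero to about 21, which is the pattern a single counter for total correct reactions could not reveal.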

3. Furthermore there were instances of a narrowing of the 
field over which the attention was distributed without the 
general height of performance decreasing. It sometimes hap- 
pened that stimuli from either one or both of the outer portions 


3 See the article referred to in previous footnote. 
of the field or from the central portion were later noticed less
than at first. The general total of course decreased where the 
stimuli from the remaining portions of the field were then 
noticed to the same extent as before. In some instances, 
however, the number of reactions to stimuli from these remain- 
ing portions increased when attention was focused on them more 
than before. The total scores thus sometimes remained the 
same or even increased and hence did not show any effects of 
fatigue, although there were very definite signs of it in the 


TABLE 3 
Observer 33 


The percentages of correct reactions, which are fairly low compared 
to those at other openings, are italicized. 
narrowing of the field over which the attention was distributed.
Such narrowing can be observed when the scores for each of the 
openings during the various periods are compared (see table 3). 

The narrowing of the field over which the distribution of 
attention takes place without the total height of performance 
decreasing, could be looked upon as a special type of increase in 
unevenness of distribution. Emphasis is however here laid on 
the fact that what can be seen of the effect of fatigue in such 
instances is not simply the increase in unevenness of distribution 
but also a certain limiting of the field of attention. 

The individual differences between O’s with respect to their
performance for some time of such a task requiring a distribution
of attention, were not confined to these various ways of
being subject to fatigue. Very often no fatigue effects could be
observed, and in one instance there was an improvement both
as to general total and evenness of distribution (see table 4).

A few words may now be added on the method used and the 
possibility of its employment as a test for fatigability in per- 
forming a task requiring a distribution of attention. It is 


TABLE 4 
Observer 32 
Relatively low percentages are italicized 
evident that one cannot make use of only two counters, as is
usually done, viz., one to register the total number of possible 
reactions and the other the total number of correct reactions. 
A decrease in the latter total during the later periods may show 
that fatigue has set in but, as was shown above, does not allow 
any deductions as to the influence of this fatigue on the evenness 
of the distribution of the attention. Such a mistake was how- 
ever presumably often made, perhaps unrecognized, by those 
who made use of this method, since they used it to measure 
both the distribution of attention and the influence of fatigue.
Furthermore it is impossible to decide on the presence or absence 
of an influence of fatigue if only the total numbers of correct 
reactions are obtainable and if these remain constant or increase 
during the later periods. Signs of fatigue, such as the narrow- 
ing of the field over which the attention is distributed or an 
increase in unevenness of distribution, would be overlooked if 
the absence of fatigue is assumed in such cases. Yet this 
assumption must have been made inasmuch as the decrease in 
the total scores was alone taken as an indication of their fatiga- 
bility in differentiating between O’s. Consequently the above- 
mentioned signs must have been overlooked and the 
differentiation between O’s as to their fatigability made on a 
false basis. 

The method, even as altered and found useful for this investi- 
gation, cannot however be recommended as a test for fatiga- 
bility in the performance of tasks requiring a distribution of 
attention. An important reason is that far too much calcula- 
tion work is necessary, in determining, e.g., the large numbers 
of percentages of correct reactions and the standard deviations 
for each period, to be practicable for routine testing. Apart 
from this, however, there are more fundamental objections. 
Only in a minority of instances can such fatigue effects as were 
described be ascertained. The others yield tables of figures 
which are only confusing and lend themselves to no definite 
interpretations. Even if this had not been the case, a final 
difficulty would have arisen, namely, to compare O’s with regard 
to their fatigability where sustained distribution of attention 
is necessary, e.g., for the purpose of vocational selection, on 
the basis of at least three effects of fatigue which are different 
not in degree but in kind. 


SUMMARY 


Three different kinds of effects of fatigue on the distribution 
of attention were described, namely, where the performance 
as a whole suffers, the distribution of attention sometimes becoming
less even than before and sometimes remaining quite
as even but only on a lower general level; where there is an
increased unevenness in the distribution of attention although 
the total height of performance remains the same or even 
increases; where the field over which the attention is distributed 
becomes narrowed without the general height of performance 
decreasing. It is also shown that Piorkowski’s ‘‘Attention- 
and Fatigue-meter” is not suitable for testing fatigability 
where sustained distribution of attention is necessary, even if 
the apparatus is so altered that a distribution of attention is 
really obtained and that the evenness of such distribution can 
be checked. 













CAN STUDENTS DISCRIMINATE TRAITS ASSOCI- 
ATED WITH SUCCESS IN TEACHING? 


J. M. STALNAKER AND H. H. REMMERS


Purdue University 


I. RELATIVE IMPORTANCE OF THE VARIOUS TRAITS


“The college exists because society desires that youth be 
taught,” said Ernest Hatch Wilkins in his inaugural address as 
president of Oberlin College. “Teaching, then, is the thing
primarily expected of the college. . . . Teaching is, in the
last analysis, the function of the college. The quality of the
teaching is the measure of the success of the college.”

That this is true practically no one denies. But upon the 
questions of how one shall know a good teacher, what traits 
go to make up a good teacher, and how one measures the quality 
of teaching, there is no such uniformity of opinion. 

Perhaps, at the outset, agreement may be obtained on one 
pertinent fact: one element in the teaching situation is the 
student’s reaction to the teacher. Does a student believe the 
teacher is competent, interesting, sympathetic, well-balanced, 
and so on? Questions as to the relative value of this student 
reaction are numerous and varied, but all admit that it has 
some weight. 

The Purdue Rating Scale for Instructors1 was developed to
measure in an objective way the student opinion of the ability 
of an instructor for his task. All the traits measured are ones 
which an instructor may with effort alter. Therefore the scale 
presents an instrument which may be used for the improvement 
of teaching. It is a graphic scale consisting of ten items or 
traits upon which an instructor is judged by his students. The 
qualities rated are: (1) Interest in Subject, (2) Sympathetic 


1 By G. C. Brandenburg, and H. H. Remmers. Published by the 
Lafayette Printing Company, Lafayette, Indiana. 
Attitude towards Students, (3) Fairness in Grading, (4) Liberal
and Progressive Attitude, (5) Presentation of Subject
Matter, (6) Sense of Proportion and Humor, (7) Self-reliance
and Confidence, (8) Personal Peculiarities, (9) Personal Appearance,
(10) Stimulating Intellectual Curiosity. Each of
these traits has three guiding phrases listed below a graphic
scale of 100 divisions. For example, under “Interest in Subject,”
these three suggestive phrases appear: “Always appears
full of his subject, Seems mildly interested, Subject seems irksome
to him.” The directions on the scale instruct the student
in rating the instructor to “make a check on the line at the
point which most nearly describes him with reference to the
quality you are considering.” The ratings of course are anonymous.
Their purpose is to give the teacher interested in self-
improvement an opportunity to get an objective check on the 
student opinion of his ability as an instructor. 

The present paper presents data to show (1) the extent to 
which students agree as to the relative importance of the ten 
traits of the scale and (2) the extent to which they discriminate 
among the various traits when rating a given instructor, i.e., 
the extent to which the Scale may be said to be free of the 
“halo effect.” 

Validity, reliability, and absence of halo effect are all essen- 
tial characteristics of any measuring instrument such as the 
Purdue Rating Scale for Instructors. One objection to it that 
is frequently met is that the ten traits differ extremely in their 
relative importance in the teaching situation. How, for example,
can “Stimulating Intellectual Curiosity” be compared
to “Personal Peculiarities”? Is not the one incomparably more
significant than the other? While it is not possible as yet to 
answer this objection in absolute terms, it is at least a fairly 
simple task to obtain the relative importance of the various 
traits as measured by student judgments. To obtain this rela- 
tive importance together with the reliability of the judgments 
was the purpose of the investigation reported in this section of 
our study.2


2 We are indebted to Mr. L. O. Hopkins for the collection and tabu- 
lation of the data. 

Each of the traits was printed on a set of cards and each
student was asked to arrange the traits in the order of their 
importance. A sufficient number of these cards was provided 


TABLE 1 
Student evaluation of the relative importance of the traits on the Purdue 
rating scale for instructors 
One class of 26 students is divided into random halves 
Interest in subject 94 89
Sympathetic attitude toward students 67 55 
Fairness in grading 58 76 
Liberal and progressive attitude 84 
Presentation of subject matter 
Sense of proportion and humor 61 54 
Self-reliance and confidence 75 87 
Personal peculiarities 25 30 
Personal appearance 42 32 
Stimulating intellectual curiosity 106

TABLE 2
The rank order correlation of three classes in judging the relative order of
importance of the ten traits

NUMBER OF STUDENTS   CORRELATION FOR   CORRELATION FOR
IN CLASS             CHANCE HALVES     ENTIRE CLASS*

26                   0.90              0.95
30                   0.85              0.92
31                   0.91              0.95

* These values are obtained by using the Spearman-Brown prophecy formula.


to supply each student of a class with a set. When the cards
had been arranged in the order of merit, the students were asked
to turn them over and to mark them from ten to one, the most
important being ten, the least important, one. The data for the
class were then divided into two random halves and the totals
for each trait obtained. Table 1 will make clear the results for 
a class of 26 students. 

Data were collected from three different classes (chiefly 
sophomores in the Schools of Engineering and Home Economics 
at Purdue University). The correlations for the rank order of 
traits were computed by the rank difference formula, 


$$\rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}$$

and the obtained values “stepped up” by means of the Spearman-Brown
prophecy formula.3 The values for random
halves of each class and the corresponding values “stepped up,”
i.e., the correlation or “reliability” for the whole class, are
tabulated in table 2. 
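
The rank difference computation can be sketched as below. The trait totals are hypothetical and are not the figures of table 1; the helper names are likewise illustrative, and the simple ranking used assumes that no two totals in a half are tied.

```python
def ranks(totals):
    """Rank 1 = largest total. Assumes no tied totals."""
    order = sorted(totals, reverse=True)
    return [order.index(t) + 1 for t in totals]

def rank_difference_rho(x, y):
    """rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical trait totals for two random halves of a class (ten traits).
half_a = [90, 70, 60, 80, 100, 65, 75, 20, 40, 110]
half_b = [85, 55, 78, 82, 105, 50, 88, 30, 35, 107]

rho = rank_difference_rho(half_a, half_b)
print(round(rho, 3))
```

In this made-up case the squared rank differences sum to 12, so rho is 1 - 72/990, or about 0.93.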

It is evident that these students agreed to a remarkable ex- 
tent on the relative importance of the ten traits. If the median 
value 0.90 for the chance halves be taken as typical, then the 
correlation for 100 students becomes approximately 0.97. 
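
The stepping up can be checked with a short sketch of the general Spearman-Brown formula. The function name is illustrative. The last line rests on an assumption the text does not state, namely that 0.90 is treated as holding for a group of roughly 29 students (the average class size in table 2), which is one reading that reproduces the figure of about 0.97.

```python
def spearman_brown(r: float, n: float = 2.0) -> float:
    """Predicted correlation when the group is made n times as large."""
    return n * r / (1 + (n - 1) * r)

half_correlations = [0.90, 0.85, 0.91]     # chance halves, from table 2
stepped_up = [round(spearman_brown(r), 2) for r in half_correlations]
print(stepped_up)                          # [0.95, 0.92, 0.95]

# Assumption: 0.90 taken as the value for about 29 students, raised to 100.
print(round(spearman_brown(0.90, 100 / 29), 2))
```

The doubled values agree with the second column of table 2.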

Table 1 apart from showing the community of opinion of the 
students is perhaps even more interesting as a statement of the 
relative values attached to the various traits. The three 
highest are Presentation of Subject Matter, Stimulating Intellectual
Curiosity, and Interest in Subject Matter. Those
rated lowest are Personal Peculiarities, Personal Appearance, 
and Sense of Proportion and Humor. The writers hazard the 
guess that a representative group of faculty members would 
not differ materially from these students in the relative weights 


3 That this formula applies to situations similar to the one here described
has been shown by one of the writers. See Remmers, H. H.,
Shock, N. W., and Kelly, E. L., An Empirical Study of the Spearman- 
Brown Prophecy Formula, etc. Jour. Educ. Psych., March, 1927, pp. 
187-195. Others who have reported investigations of this formula for 
various sorts of data are Holzinger, Jour. Educ. Psych., xiv; Holzinger 
and Clayton, ibid., xvi; Kelley, ibid., xv; Ruch, Ackerson and Jackson, 
ibid., xvii; Furfey, ibid., xvii. For the formula itself see Kelley, T. L., 
Statistical Method, p. 205. 
of the various traits. It is interesting to note in passing that
the lack of agreement for “Fairness in Grading” appeared in
each of the three classes, the rank differences being respectively 
3, 4, and 3. 


II. DO STUDENTS DISCRIMINATE AMONG THE VARIOUS TRAITS FOR 
A PARTICULAR INSTRUCTOR? 


The scale has been found to be both valid and reliable; that is, 
it actually measures what it purports to measure—the students’ 
opinion of the instructor’s qualities—and it measures these 
traits accurately.4 No complete study of the “halo effect”
of this scale has, however, been attempted up to this time. 

The halo effect, a very common source of error in rating 
scales, is “the tendency to allow the general impression of the 
individual to color very markedly the evaluation of specific 
traits. If a man impresses us favorably either in a general way 
or by virtue of some particular aspect of personality, or perhaps 
by some happy incident in our contact with him on the golf 
links, we are prone to invest his personality with a halo which 
sheds a luster upon his various traits and leads us to overestimate
the desirable and to underestimate the undesirable
in his personality.”5


The halo effect in the teacher rating scale would mean that a 
student who likes a teacher for any reason whatsoever, there- 
fore rates him high in all traits, even those in which he actually 
is deficient. In the scale herein described an effort has been 
made to overcome this effect by the use of a graphic scale, by 
dividing the rating up into various well defined traits, by the 
use of guiding phrases beneath the scale, and by the careful 
selection of traits which are not thus affected. 

If the halo effect, however, persists in spite of these precau- 
tions, it will evidence itself in the inter-correlations of the vari- 
ous traits. If there is a halo effect then the students will tend 


*Remmers, H. H., and Brandenburg, G. C. “Experimental Data
on the Purdue Rating Scale for Instructors.” Educational Administration
and Supervision, November, 1927.

* Burt, Harold E. Employment Psychology, New York, 1926, p. 368. 








TRAITS FOR SUCCESS IN TEACHING 607 


to rate the instructor according to a set pattern in all the ten 
traits. If trait 1, say, is correlated with trait 2, and a high 
positive correlation or relationship is found, a halo effect is 


TABLE 3
The standard deviations of the distribution of 94 students’ ratings
on the ten traits

TRAIT    σ

TABLE 4
The intercorrelations of all combinations of the ten traits, taken two at a
time, on the Purdue rating scale for instructors

TRAIT    1       2       3       4       5       6       7       8       9       10
1                0.286   0.335   0.355   0.379   0.449   0.715   0.462   0.408   0.491
2        0.286           0.358   0.304   0.364   0.412   0.531   0.417   0.210   0.301
3        0.335   0.358           0.331   0.444   0.416   0.288   0.434   0.265   0.420
4        0.355   0.304   0.331           0.244   0.428   0.381   0.081   0.227   0.259
5        0.379   0.364   0.444   0.244           0.446   0.429   0.534   0.550   0.533
6        0.449   0.412   0.416   0.428   0.446           0.118   0.155   0.066*  0.023
7        0.715   0.531   0.288   0.381   0.429   0.118           0.526   0.383   0.398
8        0.462   0.417   0.434   0.081   0.534   0.155   0.526           0.564   0.477
9        0.408   0.210   0.265   0.227   0.550   0.066*  0.383   0.564           0.442
10       0.491   0.301   0.420   0.259   0.533   0.023   0.398   0.477   0.442
Average  0.431   0.354   0.366   0.290   0.436   0.265   0.419   0.406   0.331   0.365

Average of all correlations, 0.366.

* The correlation coefficients so marked are negative.

Note: The correlation coefficients are based on 94 cases. All ratings
were secured from three divisions of the same subject taught by the
same instructor.


indicated. Just how much positive correlation should exist 
between traits 1 and 2—above just exactly what value the 
halo effect is indicated—is, of course, an unknown quantity. 
A more thorough study of this halo effect than was previously
made by the authors of the Scale* has been undertaken and is
here reported for one instructor’s ratings by 94 students in three
divisions of the same course.

Inasmuch as the correlation coefficient is very decidedly 
affected by the size of the standard deviation of the distribution 
—in this case a measure of the degree to which the students 
agree on any particular rating for the instructor—it is pertinent 
to give the standard deviations for the instructor herein con- 
sidered. This is shown in table 3. The ratings for this in- 
structor were approximately average. 

Table 4 gives the correlation coefficients obtained by corre- 
lating each trait with every other one. Ten traits taken two 
at a time give 45 different combinations. These correlations 
were computed in part by the Hull Correlating Machine (ΣX,
ΣX² and ΣXY).

The coefficients presented in this table show considerable 
variation in size, although in general they are low. No doubt 
their exact size is largely determined by the qualities of the in- 
structor being rated. A second study similar to this one, car- 
ried out on another instructor of radically different teaching 
methods, might give entirely different results. The results 
here obtained, however, tend to indicate quite clearly that 
there is no definite or pronounced halo effect, the average of 
all the coefficients being only 0.366. 
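
The computation summarized in table 4 can be sketched in Python. This is only an illustration under stated assumptions, not the authors' procedure (they report using the Hull Correlating Machine): the function names `pearson_r` and `mean_intercorrelation` and the toy ratings are invented for the example, and only the standard library is used. Each coefficient is formed from the sums ΣX, ΣX² and ΣXY, and the pairs are enumerated two traits at a time.

```python
from itertools import combinations
from math import comb, sqrt


def pearson_r(x, y):
    """Product-moment correlation computed from the raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    numerator = n * sxy - sx * sy
    denominator = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return numerator / denominator


def mean_intercorrelation(ratings):
    """ratings maps a trait label to the list of ratings given by each student.

    Returns the coefficient for every pair of traits and their average.
    """
    pairs = combinations(sorted(ratings), 2)
    coefficients = {(a, b): pearson_r(ratings[a], ratings[b]) for a, b in pairs}
    return coefficients, sum(coefficients.values()) / len(coefficients)


# Ten traits taken two at a time give 45 different combinations.
n_pairs = comb(10, 2)

# Hypothetical ratings of three traits by five students (not data from the study).
toy_ratings = {
    1: [1, 2, 3, 4, 5],
    2: [2, 4, 6, 8, 10],
    3: [5, 3, 4, 1, 2],
}
toy_coefficients, toy_average = mean_intercorrelation(toy_ratings)
```

With the real data, `ratings` would hold ten traits with 94 ratings each, and the average of the 45 coefficients would be compared with the 0.366 reported in the text.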


III. EFFECT OF ERRORS OF MEASUREMENT


The numerical value of any correlation is reduced by the 
errors of measurement present. Theoretically this error can 
be eliminated by correcting any given coefficient of correlation 
for attenuation.⁷ This correction is particularly important
for this study, since absence of halo effect can be argued only
to the extent that the various corrected coefficients are significantly
less than unity. The amount of error is a function
of the reliability, as appears from the formula used:

$$r_{\infty\omega} = \frac{r_{12}}{\sqrt{r_{1I}\, r_{2II}}}$$

where $r_{1I}$ is the reliability of one measure and $r_{2II}$ the reliability
of the second. It will be pertinent, therefore, to give the relia-
bilities for the various traits. They are listed in table 5.


* See Manual for the Purdue Rating Scale for Instructors.
7 Kelley, T. L., loc. cit., p. 204.
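
The correction for attenuation divides an obtained correlation by the square root of the product of the two reliabilities. A minimal Python sketch follows; the function name `correct_for_attenuation` is an assumption made for illustration, and the worked values assume that the coefficient 0.462 in table 4 and the reliabilities 0.879 and 0.915 in table 5 belong to traits 1 and 8.

```python
from math import sqrt


def correct_for_attenuation(r_12, reliability_1, reliability_2):
    """Estimate the correlation that would obtain between error-free measures."""
    for rel in (reliability_1, reliability_2):
        if not 0 < rel <= 1:
            raise ValueError("each reliability must lie in the interval (0, 1]")
    return r_12 / sqrt(reliability_1 * reliability_2)


# Traits 1 and 8: obtained r = 0.462, assumed reliabilities 0.879 and 0.915.
corrected_1_8 = correct_for_attenuation(0.462, 0.879, 0.915)
print(round(corrected_1_8, 3))  # 0.515
```

Under these assumptions the result agrees with the corrected value of 0.515 given for traits 1 and 8 in table 6.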

It was unfortunately impossible to determine the reliabilities 
of the ratings for the instructor whose ratings provide the inter- 
correlations of table 4. Such reliabilities were available, how- 


TABLE 5
The reliability of the ten traits based on 94 student judgments

TRAIT   RELIABILITY     TRAIT   RELIABILITY     TRAIT   RELIABILITY
1       0.879           4       0.869           8       0.915
2       0.852           5       0.733           9       0.965
3       0.778           6       0.798           10      0.824
                        7       0.873

ever, for an instructor teaching the same subject to the same 
kind of students. The average ratings of this instructor were
very similar to those of the instructor for whom the trait intercorrelations
were calculated. Since these intercorrelations were based on 94
students, while the reliability coefficients used were based on 30, 
these reliability coefficients were stepped up by the Spearman- 
Brown prophecy formula.* It is these derived coefficients 
that are listed above. 
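
The stepping-up described here can be sketched with the Spearman-Brown prophecy formula, r_kk = k·r / (1 + (k − 1)·r), where k is the factor by which the number of judges is increased. The sketch assumes that k is taken as 94/30; the starting reliability of 0.70 is hypothetical and is not a value reported in the study.

```python
def spearman_brown(reliability, k):
    """Predicted reliability when the number of judges is multiplied by k."""
    return k * reliability / (1 + (k - 1) * reliability)


# Assumed lengthening factor: from 30 raters to 94 raters.
k = 94 / 30

# Hypothetical reliability observed with 30 raters (illustrative value only).
stepped_up = spearman_brown(0.70, k)
```

Because k exceeds 1, the stepped-up coefficient is always larger than the original one, which is why the values in table 5 are higher than those obtained from 30 students alone.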

Table 6 gives the intercorrelations of table 4 when corrected 
for attenuation. It will be observed that the average inter- 
correlation of all traits rises from 0.366 for the uncorrected 
values to 0.435 for the corrected values. 

No doubt these various intercorrelations would fluctuate 


* See Remmers, Shock, and Kelly, loc. cit. for justification of this 
procedure. 
considerably with further sampling. The P.E., when r =
0.40 and N = 100, is 0.0567. The obtained average correlation,
however, is certainly a quite stable measure, and fully warrants
the conclusion that the Purdue Rating Scale for Instructors is
gratifyingly free from halo effect, since no doubt any “objective”
measurement of the traits (if such were available) would show
positive intercorrelations. It is further evident that none of the
traits is superfluous, for the columnar averages are all relatively
low; indeed, only one of the forty-five r’s (1 vs. 7) has a
value approaching unity.
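
The probable error quoted above follows from the usual formula P.E. = 0.6745 (1 − r²) / √N. A short Python check is given below; the function name `probable_error_of_r` is an assumption made for illustration.

```python
from math import sqrt


def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient r based on n cases."""
    return 0.6745 * (1 - r * r) / sqrt(n)


pe = probable_error_of_r(0.40, 100)
print(round(pe, 4))  # 0.0567
```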
TABLE 6
The coefficients of table 4 corrected for attenuation

TRAIT    4       5       6       7       8       9       10
1        0.406   0.472   0.535   0.817   0.515   0.443   0.577
2        0.353   0.461   0.500   0.617   0.472   0.232   0.360
3        0.403   0.588   0.528   0.350   0.514   0.306   0.525
4                0.306   0.514   0.438   0.091   0.248   0.306
5        0.306           0.584   0.537   0.651   0.654   0.686
6        0.514   0.584           0.142   0.181  -0.075   0.028
7        0.438   0.537   0.142           0.589   0.417   0.469
8        0.091   0.651   0.181   0.589           0.599   0.549
9        0.248   0.654  -0.075   0.417   0.599           0.496
10       0.306   0.686   0.028   0.469   0.549   0.496
Average  0.341   0.…     0.326   0.486   0.462   0.369   0.444

Average of all correlations, 0.435.


IV. CONCLUSIONS 


1. Students show a high degree of agreement in their judg- 
ments of the relative importance of the ten traits of the Purdue 
Rating Scale for Instructors. 

2. There is little halo effect in student judgments of teacher 
traits. 

3. Each of the traits comprising the Scale adds something 
to the total picture of the teacher as seen by the student. 











MODES OF EMPHASIS IN PUBLIC SPEAKING 


ARTHUR JERSILD 
Barnard College 


The public lecturer who aims to effect a careful balance of 
emphasis within the substance of his address might profitably 
raise the following questions: What point in a discourse is 
the most impressive, the beginning, the middle, or the end? 
What is the effect of repeating a statement one or more times, 
and how can this repetition be accomplished to best advantage? 
What is the value of emphasis devices ordinarily used in public
speaking? And finally, how do various degrees and forms of 
these factors of impressiveness compare with one another in 
strength? 

The queries here raised demand a technique of investigation 
which complies with the informative character of a public 
address and at the same time lends itself to experimental 
measurement and control. The present study was designed to 
meet this requirement. A description of the experiment follows. 

The material presented to the “‘audience” in the present study 
consisted of a biographical sketch of a fictitious person. This 
biography contained seventy separate statements, each of 
which could be presented in connection with an emphasis 
device. Experimental measurement was obtained by means 
of a test for immediate recall. The methods of control will be 
discussed at a later point. Ten groups of college students, 
comprising a total of 253 persons, served as subjects. 

In a given presentation of the material the following experi- 
mental factors were under observation: 

The factor of Position—including Primacy, represented by 
the first three statements in the discourse; Recency, represented 
by the last three statements; and the Middle position, represented
by statements falling intermediate between primacy and recency.

Repetition—under the following conditions: 


2 concentrated repetitions. The 10th and the 60th statements 
were presented twice in immediate succession. 

2 distributed repetitions. The 11th statement was repeated after 
the 60th had been presented twice; the 35th was repeated after 
the 40th. 

One statement was given 3, another 4, another 5 distributed 
repetitions. 


Artificial Emphasis—the emphasis devices employed were 
designed to correspond to some of those customarily used in 
public address. They were: 


Verbal emphasis: “Now get this.” (Proactive emphasis calling
attention to an item prior to its presentation); and “Did you
notice that?” (Retroactive emphasis directing attention to an
item immediately after its presentation.)

Loudness: voice raised above customary amplitude. 

Gesture: arm raised in conventional gesture while statement was 
being presented. 

Banging the table: fist brought violently down upon the table while 
last word of statement was being spoken. 

Slowness of speech: speed of articulation retarded to half of normal 
rate of delivery. 

Pause: the interval of pause corresponded in length to the 
amount of time required for the presentation of the preceding 
statement. 


In conducting the experiment, the experimenter announced 
to each group that an exercise would be presented and that a 
test for recall would immediately follow. No further indication 
of the nature of the experiment was given. The experimenter 
made use of a typewritten sheet containing the material and 
the necessary cues to repetition and emphasis. By means of 
previous practice he had memorized the wording of each state- 
ment, had learned to adopt an even tempo in delivery, and to 
speak in a conversational tone of voice, so that the presentation 
had many of the ear-marks of an ordinary lecture from notes. 
Following the reading of the exercise the subjects were told to
reproduce in writing as many statements as they could 
remember. 
In order to control the factor of differences in memory value
which might obtain for the various statements within a narra- 
tive account two procedures were adopted. The first of these 
consisted of rotation of the elements within the exercise, so 
that a different arrangement of the material was used with each 
of the ten experimental groups. By this means a different 
statement was used in connection with each of the experimental 
devices in each of the several presentations. Furthermore, a 
given statement which in one presentation occurred in con- 
nection with one experimental device would in other presenta- 
tions occur in connection with other devices, and in still other 
instances appear in a position near the middle of the exercise 
without bearing any relationship to any experimental factor. 
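
The rotation described above can be illustrated with a short Python sketch. The report does not say how the arrangements were generated, so the cyclic shift used here is an assumption made purely for illustration, as are the function name `rotated_orders` and the statement labels S1 to S70.

```python
def rotated_orders(statements, n_groups):
    """Return one arrangement per group, each a cyclic shift of the original.

    The fixed step between groups is an assumed scheme; any arrangement that
    places a different statement in each experimental position would serve.
    """
    n = len(statements)
    step = max(1, n // n_groups)
    orders = []
    for group in range(n_groups):
        shift = (group * step) % n
        orders.append(statements[shift:] + statements[:shift])
    return orders


# Seventy statements and ten groups, as in the experiment described.
statements = [f"S{i}" for i in range(1, 71)]
orders = rotated_orders(statements, 10)

# The statement in the first (primacy) position differs from group to group.
first_items = [order[0] for order in orders]
```

Under this scheme every group hears the same seventy statements, while any given position, and hence any device attached to that position, falls on a different statement in each group.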

As a result of this rotation of the items, the immediate or 
numerical score for a given experimental device in the investiga- 
tion as a whole was obtained from a wide variety of statements. 
This “numerical” score consisted simply in an arithmetical 
count of the number of correct reproductions within all the 
groups shown for the statements in connection with which the 
device was used. 

A further precaution against differences in memory value 
consisted in a special scoring method. From this was derived 
what will be known as the “percentage” score. The score for 
a given experimental device when used in connection with a 
certain statement was calculated in terms of the recall record 
which obtained for that same statement when appearing in 
a position near the middle of the exercise (within the range from 
number 25 to 45) independently of an experimental factor. 
For example, a given statement occupies the first position 
(first degree primacy) in the exercise as presented to a certain 
group of 20 subjects. It is reproduced by 15 on the recall test. 
This same statement, when occupying an independent middle 
position with other groups, comprising a total of 100 subjects, 
was recalled by 50. The recall ratio for this statement in the 
primary position was 15:20; when in the middle position, 
50:100. From this equation emerges the percentage score
for the first degree of primacy in the group named: 15:20 :: 50:100,
or 150; that is, the recall rate in the primary position (75 per cent)
is 150 per cent of the rate in the middle position (50 per cent).
In other words, the standard memory value of
a statement was represented by the recall record shown for 
that statement when occupying a middle neutral position with 
several groups of subjects. 
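
The percentage score in the worked example can be reproduced with a few lines of Python. The function and parameter names are assumptions made for illustration; the numbers are those given in the text.

```python
def percentage_score(recalled_with_device, n_with_device,
                     recalled_in_middle, n_in_middle):
    """Recall rate under a device as a percentage of the neutral recall rate."""
    device_rate = recalled_with_device / n_with_device
    neutral_rate = recalled_in_middle / n_in_middle
    return 100 * device_rate / neutral_rate


# 15 of 20 subjects recalled the statement in first position;
# 50 of 100 recalled the same statement in a neutral middle position.
score = percentage_score(15, 20, 50, 100)
print(score)  # 150.0
```

A score of 100 would mean that the device left recall unchanged relative to the neutral middle position; scores below 100 indicate a detriment.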

It can be seen from the above discussion that two separate 
scores, the “numerical” and the “percentage,” were obtained 
for each experimental device with each of the ten groups of 
subjects. In the present report the results obtained from the 
ten groups as a whole will be given. 




RESULTS 


In table 1 are given the results obtained by each of the two 
scoring methods. The “numerical” score represents the total 
count of the reproductions of statements in connection with 
which each of the experimental devices was used. The “per- 
centage” score is derived from a composite of all the ratios 
from which the separate group results were computed. The 
scores have been arranged in order of merit. 

The results shown in table 1 give concrete answers to the 
questions raised at the beginning of this report. The devia- 
tions between the two separately determined orders of scores 
that appear are of a minor character, and the general findings 
stand out clearly above them. In the ensuing discussion only 
the numerical score will be referred to except in those instances 
where there is a noteworthy discrepancy between this score 
and that obtained by the special scoring method. 




DISCUSSION OF THE RESULTS 


To the query as to which point in a discourse is the most 
impressive the results provide an emphatic answer. The 
respective scores for statements occupying the first three posi- 
tions in the exercise are 192, 163, and 133; the corresponding 
scores for the final positions are 139, 126, and 121; the record 
for statements falling intermediate between the first and the 
last three is 105. The beginning and end positions are clearly 
more advantageous than the middle. But more striking than 
this is the decided superiority of the beginning as compared 
with the end. 
TABLE 1
Order of merit arrangement of the scores for the various experimental
devices as determined by the two scoring methods (10 groups,
253 subjects)

NUMERICAL SCORE
5 repetitions
4 repetitions
3 repetitions
Primacy: first statement
Verbal emphasis: “Now get this”
Primacy: second statement
2 distributed repetitions
Verbal emphasis: “Did you notice that?”
Recency: last statement
Primacy: third statement
Recency: second from last statement
2 concentrated repetitions
[illegible]
Recency: third from last statement
Gesture
2 concentrated repetitions
Normal: statements intermediate between primacy and recency, unemphasized
Slowness of speech

PERCENTAGE SCORE
5 repetitions
4 repetitions
3 repetitions
Verbal emphasis: “Now get this”
Primacy: first statement
2 distributed repetitions
Primacy: second statement
2 distributed repetitions
Verbal emphasis: “Did you notice that?”
2 concentrated repetitions
Primacy: third statement
Recency: last statement
Loudness
Recency: third from last statement
Recency: second from last statement
Gesture
2 concentrated repetitions
Middle position: range from number 25 to 45, unemphasized and unrepeated
Slowness of speech

The high scores shown for the primary positions become all 
the more impressive in view of the method of measurement 
used in the present experiment. In presenting materials in 
a test for immediate recall, the time interval separating the 
presentation and the recall test is longer for items which fall
at the beginning of the exercise than for those which occupy
the end position. The elapse of time would ordinarily lead
to a decline in the strength of the impressions formed, and so,
on the face of things, the final statements should have the
advantage by virtue of their recency. But in spite of this advantage
for the final position the beginning statements stand out
far in the lead on the recall test. 

The significance of this finding for the public speaker is 
obvious. The introductory remarks of an address leave a 
stronger and more permanent impression than those which 
follow. The speaker will do well to take advantage of this 
situation. If he is sure of the good will of his audience let him 
release a salient feature of his address at the very start. To do
so will not only give added strength to his assertion but also
invest subsequent remarks with the force of that which has
gone before. If he must first win the sympathy of his auditors
let him resort to such cajolery, obsequiousness, or humor as
will be calculated to arouse a favorable attitude. To indulge
in halting or irrelevant phrases at the crucial introductory
point would be most inopportune. The first impression leaves
its lasting effects. 

With regard to the effects of repetition the results are similarly 
clear. Three or more repetitions score higher on recall than 
do any of the other devices. Yet it appears that the value of 
increased repetition does not by any means rise in proportion 
to the number of added repetitions. The average score for one 
presentation is 105; the highest score for two repetitions is 175; 
for three, the score is 211; for four, 234, and for five, 244. Five 
repetitions are in no wise five times as effective as one presenta- 
tion; nor are four repetitions twice as effective as two. 
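
The diminishing returns noted here can be made explicit with a little arithmetic on the numerical scores quoted in the paragraph above. The Python sketch below uses those scores (taking 175, the higher of the two scores for two repetitions); the variable names are assumptions made for illustration.

```python
# Numerical recall scores by number of presentations, as quoted in the text.
scores = {1: 105, 2: 175, 3: 211, 4: 234, 5: 244}

# Gain contributed by each added presentation.
gains = {n: scores[n] - scores[n - 1] for n in range(2, 6)}
print(gains)  # {2: 70, 3: 36, 4: 23, 5: 10}

# Ratios that would equal 5 and 2 if effect rose in proportion to repetition.
ratio_five_to_one = scores[5] / scores[1]
ratio_four_to_two = scores[4] / scores[2]
print(round(ratio_five_to_one, 2), round(ratio_four_to_two, 2))  # 2.32 1.34
```

Each added presentation contributes less than the one before it, and five presentations yield little more than twice the score of one.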

The relatively low scores for four and five repetitions accord- 
ing to the numerical scoring method are in part due to the fact 
that in some instances three repetitions were sufficient to bring
about unanimous recall for some statements, and as a result 
the record on a test for immediate recall gives no indication of 
the effect of repetition continued beyond that point. This 
inadequacy of the numerical scoring method could in many 
instances be corrected somewhat by the use of the percentage 
method. The percentage scores accentuate more clearly the 
difference between the values of three, four, and five repetitions. 
But even so, the effect of increased repetition appears to follow 
the “law of diminishing returns.” 

An interesting result appears in the various scores for two 
repetitions. The two cases of concentrated repetition scored 
119 and 126, whereas the scores for two distributed repetitions 
were 163 and 175. It is clearly indicated that to repeat a 
statement twice in immediate succession is far less effective than 
to allow some time to elapse between the first and second pre- 
sentation. If the speaker aims to impress his point by means of 
reiteration his most effective procedure will be to introduce 
the repetitions at spaced intervals rather than to exhaust his 
repetitions at a single stage of his address. 

We come now to the devices for artificial emphasis. All of 
these, with the one exception of speaking very slowly, have a 
positive effect. 

The most valuable of all the emphasis devices are those of 
verbal design. The comment, “Now get this,” impresses an 
item more effectively than two distributed repetitions, and, on 
the basis of results from both scoring methods, is as effective as 
the first degree of primacy. The remark, “Did you notice
that?” is superior to other remaining forms of artificial emphasis
and surpasses two concentrated repetitions.

Both of these remarks were chosen to represent a form of 
emphasis frequently used in public address. To be sure, the 
speaker will usually frame his accessory emphatic comments in 
terms more subtle than those here employed. But the present 
representatives, in spite of their crudity, demonstrate a high 
degree of impressiveness. This result can be accounted for in 
part, no doubt, on the assumption that an audience will gauge
the relative importance of items within an address largely in
terms of the stress, either explicit or implied, placed upon them 
by the speaker himself. It appears that verbal stress is more 
advantageous than that effected by gesticulation, loudness, or 
other devices employed in the present study. 

Second to verbal emphasis is the device of a pause. The 
relatively high score which obtains for this device is interesting 
in view of the fact that the time consumed in presenting the 
statement and in the ensuing pause corresponded in length to 
the time consumed by the device of two repetitions and that of 
speaking very slowly. The scores for two concentrated repe- 
titions are 119 and 126, while that for a pause is 143. It appears 
that an interval of delay serves to impress an item more strongly 
than to repeat the item immediately after its first presentation. 
But the revival of an impression effected by two distributed 
repetitions is more advantageous than the pause, as shown by 
the score for spaced repetitions. 

As compared with a pause, and other devices as well, the 
expedient of speaking very slowly fares badly. Not only does 
this device rank lower than all other forms of emphasis, but it 
actually operates as a detriment to recall. The score for 
unemphasized statements is 105, that for statements spoken 
very slowly is 82. 

The device of slowness of speech is not infrequently used as 
a means of emphasis in public address. In using this expedient 
the experimenter tried to adopt an impressive intonation, as 
though imparting a weighty truth. The effect would perhaps 
have been more salutary if the content of the statements so 
voiced had been as ponderous as their mode of delivery. But to 
this it might be answered that no statement in the exercise was 
so momentous as to merit, for example, a sweeping gesture, or to 
deserve the violent gesticulation of banging the table with one’s 
fist. Yet both of these devices, although relatively of low 
value, stand out as positive aids to recall. It seems that slow 
delivery detracts severely from the force and impressiveness 
which obtains for items that are presented at a normal rate. 

The device of making a gesture and that of banging the 
desk have already been referred to. Both devices scored higher
than unemphasized statements, as shown by the total scores. 
In the separate group results, which are not reproduced here, a 
good deal of fluctuation in the scores appears, and both devices, 
in four of the ten instances, scored lower than neutral items. 
From this observation it can be inferred that these forms of 
gesticulation may at times be harmful rather than advanta- 
geous. To what extent these methods of emphasis serve as 
aids by contributing to the animation of the delivery as a 
whole, the present results do not reveal. 

The effect of increased loudness is more beneficial than that 
of the gesture or the bang. The statement which is stressed 
by raising the amplitude of the voice has the advantage over 
statements which are presented with normal intensity. 


GENERAL SUMMARY 


The present study raises the question as to the value of 
various forms of emphasis in public speaking as effected by 
repetition, position, and special emphasis devices. An inform- 
ative exercise embodying these features was presented in a test 
for immediate recall. The effect of the emphasis devices was 
determined by the score which obtained for the emphasized 
statement. The results show that: 

The most effective, although not the most economical, form 
of emphasis is repetition to the extent of three or more presen- 
tations. The benefit arising from repetition does not increase 
in proportion to the number of added repetitions. 

Repetition is most effective when the several presentations 


1 All of the emphasis devices here under observation, it will be 
noticed, are studied with reference to their effect on the fixation of the 
isolated statements in connection with which they are used. The results 
do not give any indication of the effect of these devices in contributing 
to the impressiveness of the discourse as a whole. Woolbert (Jour. 
Applied Psych., 1920, 4, 162-185) reports a study on the effects of various 
modes of public reading. He employed eleven different modes of read- 
ing, involving changes in pitch, intensity, time, and quality of the vocal 
presentation. His results are given in terms of the effect of various 
modes on the impressiveness of the exercise as a whole. 
are separated by intervals of time. One of the least effective
forms of emphasis is to repeat an item immediately following 
its first presentation. 

The most impressive point in a discourse is the opening 
statement. The first statements have a distinct advantage 
over those which come last. To surpass the effect of primacy 
it is necessary to resort to three or more repetitions or to intro- 
duce special verbal emphasis devices. 

Of the more artificial forms of emphasis, the device of using
verbal comments which direct attention to an item is most 
effective. Next in effectiveness comes the device of introducing 
a short pause; following this, that of raising the voice above 
the accustomed amplitude; gesticulation, by a gesture and 
banging with one’s fist, comes next in order, and in general 
has a beneficial effect. The device of speaking very slowly not 
only stands lowest in effectiveness but has a decided negative 
effect. 





CORRELATION BETWEEN INTELLIGENCE TEST
SCORES AND SUCCESS IN CERTAIN RATIONAL
ORGANIZATION PROBLEMS¹


K. C. GARRISON 
George Peabody College for Teachers 


In the majority of the experiments that have been carried 
out on learning there seems to be but little stress on the neces- 
sity for rational behavior. Most of the experiments deal with 
practice effects in learning a task more or less mechanically, the 
main emphasis being placed on the acquirement of skill. Learn- 
ing in the case of acts of skill has usually been explained on the 
basis of the repetition of a particular act. Gross individual 
acts, according to this conception, can be controlled voluntarily,²
while repetition improves the performance, e.g., ball-tossing,
mirror writing, memorizing nonsense syllables, sorting
cards, and so on. Learning in the case of such activities is 
usually not considered as involving ideational activity; more- 
over this type of learning is not ordinarily explained on the 
basis of “trial and error” activity. It is this type of activity 
with which most investigations dealing with learning have 
been concerned. Hamilton³ and Yerkes⁴ have worked out
multiple choice methods and applied these in the study of 
animal learning. According to these methods the subject is 
confronted with a situation that may be reacted to in several 


1 Read before the American Association for the Advancement of 
Science, Section I, Nashville, Tenn., December 27, 1927. 

2 Most investigations seem to overlook the implications of an arbitrary
“will” involved in this assumption.

3 Hamilton, G. V., Study of Trial and Error Reactions in Mammals,
J. of Animal Behavior, 1911, i, 33.

4 Yerkes and Students (reviewed by Hunter in Psychol. Bull., 1916,
xiii, 327-330).
different ways, one of which will lead to “success,” generally
food. The Rational Learning Test as devised by Peterson is
an outgrowth of these multiple choice methods of work, and
especially from a consideration of the five differentiated types
of reactions as given by Hamilton.⁶

A kind of learning that is often regarded as peculiarly different 
from “trial and error” adjustment is ideational activity, like 
reasoning, perceiving, etc. This kind was early conceived⁷ as
the distinctly human method of learning. The recent experi- 
mental investigations in comparative psychology have shown 
that human and animal types of learning cannot be clearly 
differentiated. Thinking cannot be regarded as something 
discrete that goes on in isolation from the various interacting 
part-processes of the organism. The general nature of one’s 
experience during thinking is of some problem-solving type. 
This may take on a form of deliberation, involving linguistic 
signs or symbols; or it may assume an ill-guided hit-or-miss form. 
Trial and error behavior is manifested in both types of activity; 
but in the former case the conflicts are among systems of ideas,⁸
which are coordinated for larger and more far reaching adjustments.
The Rational Learning problem or test (1917),
followed later by the mental maze, is an attempt by Peterson 
to present problems of thinking in such a way that the situation 
at any stage can be objectively and accurately measured. 
The Rational Learning test has been used rather extensively 
in the Jesup Psychological Laboratory at Peabody College and 
to a degree in some other laboratories. It seems to have a 
closer relation to intelligence tests than do ordinary learning 
tests. The reactions required of the subjects are to associate 
the numbers 1 to 8⁹ inclusive, assigned in random order, with


5 Peterson, Jos., Experiments in Rational Learning, Psychol. Bull., 
1918, xxv, 443-467. 


* Hamilton, op. cit., p. 61. 

7 Descartes, Rene, Les passions de l’ame, 1650. 

8 For an objective conception of ideas see Peterson, The Functioning
of Ideas in Social Groups, Psychol. Rev., 1918, xxv, 214-226.

9 That is, in the case of the eight letter form. Five, six, eight, ten,
and fifteen letters have been used in the various investigations.

the first eight letters of the alphabet. “This is to be done by
means of a series of guesses the range of which may be greatly
limited by the use of a rational organization of the situation.
Each subject completes the learning at a single practice period,
varying in length inversely with the subject’s ability, roughly
speaking. As will be seen, the subject is forced to react to a
changing situation, each response making it different to a slight
degree by limiting the range of probability.”¹⁰

Another test used in this study was the Rational Analysis 
problem adapted by one of Peterson’s students (Mr. Atkinson) 
from the Rational Learning test. According to the method 
used in this investigation S was given a sheet of paper with the 
following instructions: 


Examine for a moment the eight pairs of letters below. . . . . Now, 
for the purpose of this experiment I have chosen one letter from each 
pair, eight letters altogether. Your task is to discover the eight letters 
that I have selected. We proceed in this way: You choose one letter 
from each pair, writing them down on the first line. Then read them off 
to me and I shall tell you how many (but not which particular letters) 
are wrong. You record the number wrong in the space provided at the 
end of the line so that you will not have to remember it for later use as 
you proceed with other selections. Then, make another choice of eight 
letters, read them to me, and I shall tell you how many of them are 
wrong. Record the number of errors as before, and continue guessing 
and recording until you find the eight letters which I have selected. 

Your score depends upon (1) the total time it takes you, (2) the number
of selections needed to find the eight right letters, and (3) the number
of wrong letters (“errors”) you select. Ask no questions. Re-read
these instructions if necessary. Say, “All right” when you are ready to
begin.

 1     2     3     4     5     6     7     8
A-B   C-D   E-F   G-H   I-J   K-L   M-N   O-P


S wrote down each letter selected under the pair of letters
from which the selection was to be made.¹¹ Also, the number
of errors was recorded for each repetition. The results on the
test are scored on the basis of time, errors, and number of
repetitions required.¹²


10 Peterson, Jos., Experiments in Rational Learning, Psychol. Rev.,
1918, xxv, 443.

11 See table 1 for record of subject K, and method of performing the
experiments.


TABLE 1

Showing the record of subject K and the method of performing the Rational
Analysis Problem

        1     2     3     4     5     6     7     8    NUMBER
       A-B   C-D   E-F   G-H   I-J   K-L   M-N   O-P   WRONG

  1     A     D     E     H     I     L     M     P      4
  2     A     C     E     G     I     K     M     O      2
  3     A     C     E     H     J     K     M     O      2
  4     B     C     F     H     J     K     M     O      4
  5     A     C     E     H     J     L     N     O      4
  6     A     D     E     G     I     K     M     O      1
  7     A     D     E     H     I     K     M     O      2
  8     A     C     E     G     I     K     M     O      2
  9     A     C     E     G     J     K     M     O      1
 10     A     D     E     G     I     K     M     O      1
 11     A     D     E     G     J     K     M     O      0

Name: Subject K. Class: Sophomore. Date: November, 1927.
Experimenter: Garrison.
Score: Time, 236 seconds; repetitions, 11; errors, 23.


TABLE 2

Showing the record of subject O and the method of solving and recording
results of Form B of the Rational Analysis Problem

       A-B-C   D-E-F   G-H-I   J-K-L   M-N-O   P-Q-R   SCORE

  1      B       F       G       J       M       Q       7
  2      A       D       H       K       N       P       6
  3      C       E       I       L       O       R       5
  4      A       F       H       L       N       P       5
  5      C       F       I       L       N       R       4
  6      B       E       G       J       N       Q      10
  7      A       D       G       J       M       P       8
  8      B       E       H       K       N       Q       7
  9      B       E       G       J       N       Q      10
 10      A       E       G       J       N       P      10
 11      A       F       G       J       N       Q       8
 12      B       E       G       K       N       Q       9
 13      C       E       G       K       N       P      10
 14      B       E       G       J       N       P      12

Name: Subject O. Class: Sophomore. Date: November, 1927.
Experimenter: Garrison.
Score: Time, 520 seconds; repetitions, 14; errors, 57.
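The experimenter's feedback in the Rational Analysis problem (the number of wrong letters, but not which ones) can be illustrated with subject K's record. The following is a minimal sketch, assuming that the eight selected letters are those of subject K's final, errorless repetition in table 1; the function name `number_wrong` is hypothetical and is not part of the original procedure.

```python
# Sketch of the experimenter's feedback in the Rational Analysis problem.
# TARGET is an assumption: the letters of subject K's last (errorless) row.
TARGET = "ADEGJKMO"


def number_wrong(guess: str, target: str = TARGET) -> int:
    """Count the pairs in which the guessed letter is not the selected one."""
    if len(guess) != len(target):
        raise ValueError("a guess must contain one letter from each pair")
    return sum(g != t for g, t in zip(guess, target))


# Subject K's eleven repetitions, read row by row from table 1.
SUBJECT_K = [
    "ADEHILMP", "ACEGIKMO", "ACEHJKMO", "BCFHJKMO", "ACEHJLNO",
    "ADEGIKMO", "ADEHIKMO", "ACEGIKMO", "ACEGJKMO", "ADEGIKMO",
    "ADEGJKMO",
]

wrong_per_repetition = [number_wrong(g) for g in SUBJECT_K]
total_errors = sum(wrong_per_repetition)
```

Under this assumption the per-repetition counts agree with the last column of table 1, and their sum is the 23 errors over 11 repetitions entered in subject K's score.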

Form B of the Rational Analysis problem embraces the principle
of the analysis test which has just been described. Here
the subject is given several groups of letters, each group being
composed of three letters. The letters in each group have been
numbered 0, 1, and 2. The task with which S is confronted is to
select the letters numbered 2, so that the sum of the values of the
letters selected will give the maximum score. S calls out the letter
selected from each group, at the same time writing the letter
in the column under the group from which it
was selected.¹³ E tells S the score (sum of the values assigned
the letters) made for each repetition and S records this score.


FIG. 1. ANALOGY TEST


Form B of the Rational Analysis problem is scored similarly to 
the Rational Analysis problem. Errors in Form B are deter- 
mined by the total value of the letters selected, being inversely 
related to the value of the letters chosen. 
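The relation between score and errors in Form B can be sketched as a small worked example. The text says only that errors are inversely related to the total value of the letters chosen; the rule used below (errors = maximum possible score minus the score obtained) is an assumption. It does, however, reproduce the 57 errors recorded for subject O in table 2 when the illegible score of the eleventh repetition is read as 8. The function name is hypothetical.

```python
MAX_VALUE_PER_GROUP = 2  # the best letter in each group carries the value 2
N_GROUPS = 6             # groups A-B-C through P-Q-R in table 2


def form_b_errors(score: int, n_groups: int = N_GROUPS) -> int:
    """Errors for one repetition, assumed to be the shortfall from the maximum."""
    return MAX_VALUE_PER_GROUP * n_groups - score


# Scores of subject O for the fourteen repetitions in table 2
# (the eleventh is illegible in the scan and is read here as 8).
SUBJECT_O_SCORES = [7, 6, 5, 5, 4, 10, 8, 7, 10, 10, 8, 9, 10, 12]

errors = [form_b_errors(s) for s in SUBJECT_O_SCORES]
total = sum(errors)
```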

One other test involving thinking and the reaction to rela- 
tions was used. This problem is a form of an analogy test, and 
was taken from the tests by Thurstone for the American Council
on Education. In this case twenty-four different analogy
problems were used, and the test was administered as a group 
test. Figure 1 illustrates the type of problem situation in- 
volved in this. The first two of the three forms on the left side 
of the page have a definite relation to each other. The task of 


12 See the score under the record of subject K. 


13 See table 2 for record of subject O, and method of solving the
problem.

the subject then is to detect this relationship, and to find the
form from the other five forms that is similarly related to the
third form.

The scores made on these tests were correlated with the scores 
made on two group intelligence tests,—namely, Otis Self-Ad- 
ministering Tests of Mental Ability and the Detroit Advanced 
Intelligence Test. Both intelligence tests have established 


TABLE 3

Correlations of the test factors of two eight-letter forms of the Rational
Learning Test (N = 49)

                                              UNCLASSI-               PERSE-
                       TIME       REPETI-     FIED        LOGICAL     VERATIVE    SCORE*
                                  TIONS       ERRORS      ERRORS      ERROR

Time.................  0.75±0.04  0.62±0.06   0.74±0.04   0.52±0.07   0.34±0.08   0.68±0.05
Repetitions..........  0.61±0.06  0.60±0.06   0.55±0.07   0.43±0.08   0.27±0.09   0.61±0.06
Unclassified errors..  0.70±0.05  0.70±0.05   0.70±0.05   0.63±0.06   0.30±0.09   0.67±0.05
Logical errors.......  0.60±0.06  0.48±0.07   0.52±0.07   0.62±0.06   0.32±0.09   0.54±0.07
Perseverative error..  0.39±0.08  0.22±0.09   0.32±0.09   0.18±0.09   0.24±0.09   0.30±0.09
Score................  0.72±0.05  0.63±0.06   0.67±0.05   0.59±0.06   0.32±0.09   0.67±0.05

* Score indicates the average of time, repetitions and unclassified errors.


TABLE 4

Correlations between the test factors of the eight-letter Rational Learning
Test and certain intelligence test scores

                                        RATIONAL LEARNING
INTELLIGENCE TESTS
                     Time       Repetitions  Unclassified  Logical    Perseverative

Otis S. A. Tests...  0.37±0.07  0.21±0.08    0.26±0.08     0.22±0.08  0.07±0.09
Detroit Advanced...  0.31±0.07  0.41±0.07    0.37±0.07     0.25±0.08  0.14±0.08


norms and have been found to have a high coefficient of re- 
liability. High reliability is not, however, a guarantee of high 
validity. By coefficient of reliability is meant the degree of 
consistency with which a test measures that which it purports to 
measure. The various measures that are being widely used in 
the investigations of the higher thought processes need a 
thorough analysis as to the factors they are actually measuring. 
It is very probable that most intelligence tests are omitting
important intelligence factors that are measured by some of the
various performance tests.

The writer found in a former study¹⁴ that the test factors of
the Rational Learning problem had a rather high coefficient
of reliability, as is shown in table 3. When the reliability of
these factors is compared with the reliability of such problems
as the pencil maze, star tracing and the like, the reliability of
the Rational Learning problem stands out as quite significant.
Furthermore, it was found that all the test factors were positively
related to scores on the Army Alpha and the Otis
Self-Administering tests and to the number of “points” earned in
Psychology. Table 4 shows that the present study¹⁵ tends to
corroborate earlier findings with the Rational Learning problem,
i.e., a positive relationship was found between each of the test
factors of the Rational Learning test and intelligence test scores.
Partial and multiple correlations are not given between the
Rational Learning test factors and intelligence test scores for
this study; but investigations¹⁶ have shown that the sizes of
these correlations may be considerably increased by an optimum
weighting of the test factors of Rational Learning. Partial
correlations further show that time and repetitions measure
common factors that are also being measured by the Otis test.
Of the three kinds of errors, it has been shown that logical errors
are the most important in the eight-letter form with respect to
the relation that the different kinds of errors have to intelligence
test scores and to average grade points. By combining equally
centile scores¹⁷ of the logical errors and time a correlation of


14 Garrison, K. C., An Analytic Study of Rational Learning, George
Peabody College for Teachers, Contribution to Education, Number 44,
1928, p. 27.

15 Garrison, K. C., pp. 34, 35, and 44.

16 Garrison, K. C., op. cit., p. 35; Haught, B. F., The Interrelations of 
Some Higher Learning Processes, Psychol. Monog., 1921, xxx, No. 6, 
pp. 16-17; Peterson, Jos., Tentative Norms in the Rational Learning 
Test, J. Applied Psychol., 1920, iv, 250-257. 

17 These are scores on a scale ranging from 1 to 100 which are based
not upon rank, but upon the area under the normal probability curve.
For a further discussion and interpretation of this see page 24 of the
writer’s work already referred to.

0.42±0.06 was found; this has been found to be one of the
simplest combinations of the test factors that one can make for
predicting intelligence test scores, and almost as valuable as any.
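The figure following each ± sign in tables 3 and 4 is presumably a probable error. The text does not state how it was computed; the sketch below assumes the conventional formula, P.E. = 0.6745(1 − r²)/√N, which reproduces the entries of table 3 for N = 49. The function name is hypothetical.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Conventional probable error of a correlation coefficient r from n cases."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Two entries from the first row of table 3 (N = 49).
pe_time = probable_error_of_r(0.75, 49)         # roughly 0.042
pe_repetitions = probable_error_of_r(0.62, 49)  # roughly 0.059
```

Rounded to two places these give 0.04 and 0.06, the values printed beside 0.75 and 0.62 in table 3.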

Tables 5 and 6 give the correlations between the intelligence
test scores and the test factors of the two forms of the Rational
Analysis tests. The correlations are low and none is statistically
reliable, tending to show that there is no relation between
the intelligence test scores and the Rational Analysis
scores.
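One way to make "statistically reliable" concrete is to compare each coefficient with its probable error. A rule of thumb of the period treated a correlation as reliable only when it was about four times its probable error; that multiple is an assumption here, since the text names no criterion. The sketch reads the (r, P.E.) pairs from tables 5 and 6, taking the entry printed as 6.05 in the scan to be 0.05.

```python
# (r, P.E.) pairs read from tables 5 and 6.
RATIONAL_ANALYSIS = [
    (-0.10, 0.11), (-0.01, 0.11), (0.00, 0.11),    # table 5, Otis
    (-0.20, 0.10), (-0.09, 0.11), (0.05, 0.11),    # table 5, Detroit
    (0.10, 0.12), (-0.01, 0.12), (0.08, 0.12),     # table 6, Otis
    (-0.08, 0.12), (-0.11, 0.12), (-0.04, 0.12),   # table 6, Detroit
]


def is_reliable(r: float, pe: float, multiple: float = 4.0) -> bool:
    """True when |r| is at least `multiple` times its probable error (assumed rule)."""
    return abs(r) >= multiple * pe


any_reliable = any(is_reliable(r, pe) for r, pe in RATIONAL_ANALYSIS)
largest_ratio = max(abs(r) / pe for r, pe in RATIONAL_ANALYSIS)

# For contrast: time on the Rational Learning test against the Otis (table 4).
time_vs_otis = is_reliable(0.37, 0.07)
```

No coefficient in tables 5 and 6 exceeds about twice its probable error, whereas the Rational Learning time factor in table 4 is more than five times its probable error.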

TABLE 5

Correlations between the test factors of the Rational Analysis Test and
certain intelligence test scores

                                 RATIONAL ANALYSIS
INTELLIGENCE TESTS
                      Time          Repetitions    Errors

Otis S. A. Tests....  -0.10±0.11    -0.01±0.11      0.00±0.11
Detroit Advanced....  -0.20±0.10    -0.09±0.11      0.05±0.11














TABLE 6

Correlations between the test factors of Form B of the Rational Analysis
Test and certain intelligence test scores

                             FORM B, RATIONAL ANALYSIS
INTELLIGENCE TESTS
                      Time          Repetitions    Errors

Otis S. A. Tests....   0.10±0.12    -0.01±0.12      0.08±0.12
Detroit Advanced....  -0.08±0.12    -0.11±0.12     -0.04±0.12











Trial and error activity was manifested by all the subjects 
during the early stages of the learning process. One subject 
on Form A of the Rational Analysis test and two on Form B 
worked on the problem for forty-five minutes and yet failed to 
see a method of solution. Their search for a solution of the 
problem with which they were confronted was wholly of a trial 
and error type, and the responses followed the order of chance 
rather than being guided by previous responses. The results 
from these subjects were discarded in our correlation. General
observation also showed that in most cases the subject hit
upon the idea of how to solve the problem immediately following 
a very good chance record. One could not with any accuracy 
predict at what stage in the problem this bit of insight would 
flash upon the subject. 

The results on the Analogy Form test were found to give a
correlation of 0.50±0.06 with the Otis Tests and 0.41±0.06
with the Detroit. This test was given as a group test and, as
the correlations show, measures some factors positively related to
certain factors in the intelligence tests.

Grade points made from objective tests given weekly¹⁸
were correlated with the learning tests. All correlations were
found to be positive, the highest being 0.44±0.06 between
repetitions on the Rational Learning test and grade points,¹⁹
while the next highest was 0.43±0.07 between the Analogy
Form test and grade points. There was little difference in the
size of the correlations found between the Rational Analysis
and the Modified Form of the Rational Analysis with grade
points. The correlations between the Otis and the Detroit
intelligence test scores, on the one hand, and grade points on
the other were 0.34±0.08 and 0.30±0.08, respectively.

In summarizing the correlation findings between the learning 
problems and intelligence test scores, the following factors are 
revealed: 

1. Rational Learning and the Analogy Form test scores give
reliable positive correlations with intelligence test scores. No
relationship was found between the Rational Analysis tests and
intelligence test scores.²⁰


18 This method of computing grade points was adapted from a method 
used by Peterson. For a description of this method see Peterson, Jos., 
The Rational Learning Test Applied to Eighty-one College Students, 
J. of Educ. Psychol., 1920, xi, 137-150. 

19 Similar results with respect to the test factors of the Rational 
Learning tests are reported by Garrison, K. C., op. cit., p. 44. 

20 One cannot conclude from these findings that the Rational Analysis
problems do not measure any of the factors found in intelligence. It is
probable that they measure important factors that are not measured by
the two intelligence tests used in this study, or that the tests are so
hard for the subject as to bring about only guessing.

2. Positive correlations were found between all the test factors
of each learning problem and grade points. Thus, grade
points depend partially upon ability as measured by the Analysis
tests but the latter is not dependent upon intelligence as
measured by the tests.

3. The Rational Learning and the Analogy Form tests are 
better than the intelligence test scores for predicting grade 
points in psychology classes. 

4. The Rational Analysis and Modified Rational Analysis
tests would probably serve better to illustrate and study “types
of thinking” or “general reactive tendencies”²¹ rather than to
analyze specific abilities.


21 Hamilton, G. V., A Study of Trial and Error Reactions in Mammals,
J. Animal Behavior, 1911, i, 33; Yerkes, R. M., The Study of Human
Behavior. Contributions from Psychopathic Hospital, Boston, Mass.,
No. 25, 1913.





POWER AND SPEED: THEIR INFLUENCE UPON 
INTELLIGENCE TEST SCORES 


FRANK S. FREEMAN


Cornell University 


In studying the respective contributions of power and of 
speed to intelligence test scores, it has been stated that, when 
the time limit is doubled, if the correlation between single and 
double time scores is high, the test is then necessarily a speed 
test. Such a conclusion seems unwarranted. If the test is 
purely a test of rate, then the correlation will be high if the
items are sufficient in number to permit few or none of the 
subjects to complete them all within the time limit. It is clear 
that if the time allowance is of such length that everyone has an 
opportunity to attempt every item, time ceases to be a factor, 
and power only is significant. However, the converse—that 
a high correlation means the test is a measure of speed alone, or 
chiefly—is doubtful; in fact, highly improbable. 

If the time is doubled, but if all the subjects are not enabled 
thereby to attempt every item, the correlations for single and 
double time will still be ambiguous, for both speed and power 
will have entered into the results achieved. The analogies 
which have been drawn with foot races are highly questionable. 
It has been maintained by some that if we compare the distance 
run by competitors during ten seconds and then that run in 
twenty seconds, if the same rate were maintained by the runners, 
doubling the time would not affect their relative scores; and, 
therefore, it is concluded that speed only would be measured. 
But what is being overlooked here—even granted that we 
accept the doubtful analogy—is the fact that power is necessary 
to maintain speed over increased distances. It would be more 
nearly to the point to talk of long distance runs; from one-half 
to two miles, for example.

In the case of any intelligence test the time factor should
be eliminated in the retesting if we wish to examine the parts 
played by power and speed. By so doing we are enabled to 
derive correlations for scores achieved with a time limit imposed, 
and scores achieved where there is practically no time limit, and, 
therefore, where speed plays no part. If under these conditions 
there is a high correlation, then we may justifiably conclude 
that the test is primarily one of power, and that the time limit 
established by the author is one of convenience chiefly. If, 
however, there is a low correlation we may conclude that the 
test with its original time limit measures speed for the most 
part, inasmuch as it appears that the slow, plodding, accurate 
individual is able to score much higher if he is given sufficient 
time to undertake the solution of each problem, whereas the 
rapidly working subject profits little or none from the increase 
in time, for he will reach an impasse beyond which progress will 
not be possible even with the added opportunity of attempting 
the solutions of the increasingly difficult items. 
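The comparison proposed above amounts to correlating each subject's score under the regular time limit with the score obtained when time is practically unlimited. The following is a minimal sketch of that computation. The six pairs of scores are hypothetical and are not data from this study, and the cut-off of 0.80 for calling a correlation "high" is an arbitrary illustration, not a criterion given in the text.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical raw scores for six pupils (illustration only).
timed = [12, 20, 25, 31, 38, 44]    # regular time limit
untimed = [15, 22, 30, 33, 45, 50]  # practically no time limit

r = pearson_r(timed, untimed)

# Illustrative cut-off only: a high r suggests the timed test ranks
# subjects much as an untimed (power) test would.
interpretation = "primarily power" if r >= 0.80 else "speed plays a large part"
```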

The data presented in this paper were derived from the testing 
of one hundred school children with the Dearborn Group Test, 
Series II, Examination C. This examination consists of four 
parts: (1) picture sequence, (2) word sequence, (3) form completion,
(4) sentence completion. The children tested were in
grades 4 to 7, although of the 100 subjects, 92 were in grades 6
or 7. The range of age is 9-9 to 15-2; of M.A. from 7-10
to 16-10. Fifty-two were girls, and forty-eight boys.

The group test was first administered exactly according 
to the author’s directions, with the set time limit. Several days 
later the test was once more administered but the time was 
doubled; and the subjects were told that there would be no need
for haste because there would be sufficient opportunity for all 
to finish. It was observed that under these revised conditions 
everyone had enough time to attempt every item, and that the 
large majority were enabled to go over their answers in order 
to check them as had been suggested by the tester in the revised 
directions. 

The correlations given in table 1 were found for single and
double time scores. The coefficients were obtained, of course,
by using the raw scores of both sets of tests. From these correlations
it appears quite clearly that the test as a whole is a
measure of power rather than of speed. The coefficient of
.88 ± .01 is remarkably high and at the same time furnishes
an index of the reliability of the test.

The coefficients for the scores of the parts are very high in 
three cases (Parts I, II, and IV), and only moderately high 
for Part III, form completion. It may be that because for 




















TABLE 1
                          r
Part I................  .81 ± .01
Part II...............  .76 ± .02
Part III..............  .64 ± .02
Part IV...............  .91 ± .01
Total.................  .88 ± .01


TABLE 2
Range in scores
                        REGULAR    DOUBLE
                         TIME       TIME
Part I................   1-14       1-15
Part II...............   0-11       0-12
Part III..............   0-15       0-15
Part IV...............   2-28       3-32
Total.................   5-59       9-71











form completion the time limit is rather brief, the element of 
speed does play a greater part than in the other three sections. 
But the very nature of the form completion is of the sort which 
we might expect to be most significantly influenced by added 
time; for the subject has before him a geometrical figure to 
be divided according to certain directions. Thus with the 
added time he can experiment with and study the lines he has 
drawn. But a correlation of .64 ± .02 is certainly sufficiently
high to justify the conclusion that even Part III is principally
a test of power; for if it were not we should get a coefficient
very close to zero.

The scores themselves also support the correlations in showing 
the Dearborn Group Test to be a measure chiefly of power, as 
will be seen from table 2 which presents the range of score for 
each part and for the total, with regular time and time doubled. 

These scores in themselves would not be significant; but 
when regarded in conjunction with the preceding coefficients 
of correlation they are important, for these data indicate that 




















TABLE 3
                         M₁*     M₂     S.D.₁   S.D.₂
Part I................   8.4     9.6     2.9     2.8
Part II...............   5.5     6.5     2.5     2.6
Part III..............   6.4     9.9     4.5     4.1
Part IV...............  13.9    16.2     5.4     6.0
Total.................  36.1    44.5    11.5    12.6

* Subscript 1 for regular time; subscript 2 for double time.


TABLE 4
                         POSSIBLE
                       MAXIMUM SCORE
Part I................      15
Part II...............      15
Part III..............      15
Part IV...............      34
Total.................      79








not only is the rank order much the same for the score obtained 
under regular and double time, but the scores themselves are not 
appreciably increased in spite of the lengthened time. 

Furthermore, table 3 giving the means and standard devia- 
tions is consistent with tables 1 and 2. 

Inasmuch as under the revised conditions sufficient time was
permitted to allow all subjects to attempt each item and to
check their replies, we might expect, if the test were one of rate
of work, to find many of the scores at or close to the maximum.
But tables 2 and 3 taken with table 4 show that no such condition
obtains.
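The comparison of tables 3 and 4 can be worked through directly by expressing each double-time mean as a fraction of the possible maximum. The sketch below uses the values as they appear in those tables; the row labels (Parts I to IV and Total) are taken from the text, since they are illegible in the tables as scanned.

```python
PARTS = ["Part I", "Part II", "Part III", "Part IV", "Total"]
MEAN_DOUBLE_TIME = [9.6, 6.5, 9.9, 16.2, 44.5]  # table 3, M2
POSSIBLE_MAXIMUM = [15, 15, 15, 34, 79]         # table 4

# Mean score under double time as a fraction of the possible maximum.
fraction_of_maximum = {
    part: round(mean / maximum, 2)
    for part, mean, maximum in zip(PARTS, MEAN_DOUBLE_TIME, POSSIBLE_MAXIMUM)
}
highest_fraction = max(fraction_of_maximum.values())
```

On these figures the mean under double time does not reach even two-thirds of the possible maximum on any part, or on the test as a whole.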

From the foregoing data it is reasonable to conclude that the 
Dearborn Group Test is a measure primarily and largely of 
power rather than speed. It seems clear, too, that the time 
limit does not make a test a measure of rate simply by virtue of 
the fact that there is a time limit. If the Dearborn test is 
typical of the modern group examinations then it may be 
maintained justifiably that they are measures of capacity 
chiefly in terms of power. At any rate this may be said of those 
group tests which are very much like the Dearborn. 

Furthermore, if the influence of rate and power upon test 
scores is to be studied, the effect of a time limit must be removed, 
for it is only by so doing that we can arrive at unambiguous 
results. Of course, it may be desirable to measure rate of 
performance as has been so frequently done; but even in those 
instances which are conceived for that purpose great care is 
necessary to recognize the place of power in performances 
which are apparently tests of rate. 























SPEED AND ACCURACY AS FACTORS IN OBJECTIVE 
TESTS IN GENERAL PSYCHOLOGY 


HOWARD P. LONGSTAFF AND JAMES P. PORTER
Ohio University 


From the earlier administration of fourteen objective tests in 
General Psychology wide individual differences in both speed 
and accuracy were clearly evident. In view of the increasing 
number of studies in this field and the multiplicity of conflicting 
opinions it seemed desirable to subject our data to statistical 
analysis. One hundred eighty-six university students studying
General Psychology have furnished our data. The tests were
fourteen in number and of the multiple-answer three-phrase
type. Each test consisted of fifteen statements, with problems in
the laboratory examinations substituted for some of the statements
so as to be roughly equivalent to five of the total of
fifteen. The examinations were given one each week on Friday
throughout the semester beginning with the third or fourth week
of the semester.

The tests have been numbered in the order in which they were taken;
the first laboratory test given was called Lab. Series I, the second,
Lab. Series II, etc. Laboratory Series II, III and IV contained
problems on finding the median, Q₁, Q₃ and the mean deviation.
Series V and VI contained problems on solving correlation by
the rank difference method and finding the probable error of
rho. The speed was found by having all the students begin
the examination at the same time. As soon as they had finished 
the papers were handed to the instructor. The time to the 
nearest minute required to underscore the selected phrases 
was taken as the measure of speed. 

Objective tests are recognizably more quickly answered than
are the old essay type. Such being the case the question arises:
Is there any added significance to be attached to the time
taken in answering this type of question? Looking at the
same question from a somewhat different angle: Should they
be scored for speed as well as for accuracy? Should the final
score be the result of the number of correct statements plus or
minus a certain weighting for speed? May the speed factor
be entirely ignored?

The literature on speed and accuracy as weighted factors in
a final score is disappointingly meager. In analyzing college
freshmen examinations King and McCrary¹ found that the
correlations between speed and scores made on the different
parts of the examination were very low.

Mudge² investigating time and accuracy on mental tests
found a few of the brightest students to be most accurate and
a few of the dullest to be most inaccurate. The majority
showed little relationship between speed and accuracy. Probably
if he had computed the correlation for the entire group his
correlations would have been low.

Ruch³ working on “Power” vs. “Speed” in the Army Alpha
repeated May’s⁴ experiment on the effect of lengthening the
time allowed in taking the Army Alpha Test. He found that:
(a) speed factors did not invalidate the test; (b) increased time
did not enable dull subjects to catch up with the bright ones;
(c) there was no increase in amount of overlapping of high and
low groups with increased time.

Myers⁵ found that students could do just as good work in
about two-thirds the amount of time usually spent if they were
limited as to time or worked under pressure.


1 King, Irving and McCrary, James. 1918. Freshman Tests of the
State University of Iowa. Journal of Educational Psychology, 9: 32-47.

2 Mudge, E. Leigh. 1921. Time and Accuracy as Related to Mental
Tests. Journal of Educational Psychology, 12: 159-161.

3 Ruch, G. M. 1923. “Power” vs. “Speed” in Army Alpha.
Journal of Educational Psychology, 14: 193-209.

4 Psychological Examining in the United States Army. 1921.
Memoirs of the National Academy of Science, 15: 416.

5 Myers, G. C. 1915. Learning Against Time. Journal of Educational
Psychology, 6: 115-116.

Thorndike⁶ in studying speed and accuracy in addition found
the faster students more accurate.

In summarizing their results it is clear that there is some 
disagreement. Some find no relationship between speed and 
score while others find at least a tendency towards an inverse 
relationship. 


TABLE 1
The speed and accuracy scores as eighteen variables

X1   Average number of statements correctly answered in seven multiple-
     answer tests in lecture-text work in general psychology
X2   Average number of statements and problems correctly answered
     in seven multiple-answer tests in laboratory work in general
     psychology
X3   Average of X1 and X2 combined
X4   Average time to the nearest minute on tests in X1
X5   Average time to the nearest minute on tests in X2
X6   Average time to the nearest minute on tests in X1 and X2 combined
X7   Accuracy scores made on lecture-text examination, series II
X8   Accuracy scores made on lecture-text examination, series V
X9   Accuracy scores made on lecture-text examination, series VII
X10  Speeds made on lecture-text examination, series II
X11  Speeds made on lecture-text examination, series V
X12  Speeds made on lecture-text examination, series VII
X13  Accuracy scores made on laboratory examination, series II
X14  Accuracy scores made on laboratory examination, series V
X15  Accuracy scores made on laboratory examination, series VII
X16  Speeds made on laboratory examination, series II
X17  Speeds made on laboratory examination, series V
X18  Speeds made on laboratory examination, series VII





We hear school men very authoritatively saying that the 
fast students make the best grades and the slow ones the poorest 
on objective tests. Statements of this kind are usually based 
on the assumption that if a student knows the subject in which 
he is being tested it should follow that he requires but a short 
time to make his answer. Needless to say this assumption 
merits confirmation from many sides. The whole matter of 


6 Thorndike, E. L. 1915. The Relation Between Speed and Accuracy
in Addition. Journal of Educational Psychology, 5: 537.

speed and accuracy is so involved and composed of so many
different factors that one could speculate indefinitely and then
probably be little closer to a solution. This study is a definitely
controlled attempt to answer by the use of objective methods
the question: Is there any relationship between the time used


TABLE 2
Inter-accuracy coefficients of correlation and P.E.

r1,2   = 0.5543 ± .034        r2,3   = 0.8772 ± .011
r1,3   = 0.8662 ± .012        r2,7   = 0.4606 ± .039
r1,7   = 0.5598 ± .034        r2,8   = 0.3078 ± .044
r1,8   = 0.5734 ± .033        r2,9   = 0.4034 ± .041
r1,9   = 0.5483 ± .034        r2,13  = 0.6656 ± .027
r1,13  = 0.4305 ± .040        r2,14  = 0.5303 ± .035
r1,14  = 0.2625 ± .046        r2,15  = 0.6168 ± .030
r1,15  = 0.4334 ± .040

r3,7   = 0.5895 ± .032        r7,8   = 0.1974 ± .047
r3,8   = 0.4909 ± .037        r7,9   = 0.4126 ± .041
r3,9   = 0.6114 ± .031        r7,13  = 0.3316 ± .044
r3,13  = 0.6183 ± .030        r7,14  = 0.2704 ± .045
r3,14  = 0.4715 ± .038        r7,15  = 0.3495 ± .043
r3,15  = 0.6025 ± .031

r8,9   = 0.4012 ± .041        r9,13  = 0.3296 ± .044
r8,13  = 0.2718 ± .045        r9,14  = 0.2791 ± .045
r8,14  = 0.1985 ± .047        r9,15  = 0.3482 ± .043
r8,15  = 0.2973 ± .045

r13,14 = 0.2602 ± .046        r14,15 = 0.2065 ± .047
r13,15 = 0.1908 ± .047





in taking these tests and the number of items answered 
correctly? 

The speed and accuracy scores were arranged into eighteen
variables as shown in table 1. Table 2 gives the inter-accuracy
correlations. It can be seen that the tests correlate moderately
positively. The average of the lecture correlations is 0.4488.
The average of the laboratory correlations is 0.4117. When
just the three separate lecture tests and the three separate
laboratory tests are correlated with themselves, we find the
lecture average to be 0.3371, and the laboratory average to be
0.2192. This means that there is little relation between any two
tests of the series and a moderate agreement between any
one test and the average of the seven of which it is one. This
is shown by the average correlation between lecture average
and lecture tests, 2, 5 and 7, which is 0.5605, while the same
average for the laboratory tests is 0.6042. From these results

TABLE 3 
Inter-speed coefficients of correlation (speed reliability coefficients) 

Tas = -0.1529 ± .048     Ts. = 0.2420 ± .046      Teun = 0.4741 ± .038 
Tr16 = 0.1006 ± .048     Tsa2 = 0.3360 ± .043     "a6 = 0.4765 ± .038 
Ts17 = 0.1585 ± .048     Tsis = 0.3644 ± .043     Teas = 0.5092 ± .036 
Tsis = 0.2957 ± .045     Ts10 = 0.4125 ± .041     7647 = 0.5211 ± .036 
Toro = 0.4192 ± .040     reig = 0.4472 ± .039     rei2 = 0.5873 ± .032 
Tau = 0.4518 ± .039      Ts17 = 0.4890 ± .037     Teéi0 = 0.6259 ± .030 
Tae = 0.4682 ± .038      .<. = 0.7774 ± .019 
Torr = 0.5012 ± .037 

Trois = 0.2580 ± .046    rug = 0.1491 ± .048      rinse = 0.2690 ± .045 
Tioar = 0.3408 ± .043    ruar = 0.2484 ± .046     rizay = 0.3549 ± .043 
Tiois = 0.4810 ± .038    riuas = 0.1351 ± .048    riais = 0.4914 ± .037 
Tiou = 0.5213 ± .036     Tua: = 0.5525 ± .034 
Tio. = 0.6100 ± .031 

T6317 = 0.1503 ± .048    Tizas = 0.2963 ± .045 
Tieis = 0.2345 ± .046 





we may conclude that the tests are fairly consistent in their 
measurement. 
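
The averages quoted above can be checked against the printed coefficients, and the probable error attached to each coefficient can be approximated from the usual formula, P. E. = 0.6745 (1 - r²) / √N. What follows is a minimal Python sketch and not the authors' own procedure. The function and constant names are illustrative, the six coefficients are read from table 2 as printed, and N_ASSUMED = 188 is a hypothetical number of students (the number is not stated in this part of the article), chosen only because it yields probable errors of about the printed size.

```python
from math import sqrt
from statistics import mean

# Coefficients read from table 2 as printed.
# Lecture average against the three separate lecture tests:
LECTURE_AVERAGE_VS_TESTS = [0.5598, 0.5734, 0.5483]
# The three separate lecture tests correlated among themselves:
INTER_LECTURE_TESTS = [0.1974, 0.4126, 0.4012]

# Hypothetical sample size; an assumption, not a figure from the article.
N_ASSUMED = 188


def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n cases."""
    return 0.6745 * (1.0 - r * r) / sqrt(n)


print(round(mean(LECTURE_AVERAGE_VS_TESTS), 4))     # text reports 0.5605
print(round(mean(INTER_LECTURE_TESTS), 4))          # text reports 0.3371
print(round(probable_error(0.5543, N_ASSUMED), 3))  # table prints .034
```

Under the assumed N the formula gives about .034 for r = 0.5543, which agrees with the value printed beside that coefficient; a different N would shift every probable error in proportion.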

Table 3 gives the speed reliability coefficients. The inter- 
lecture coefficients of speed range from 0.419 to 0.610 with an 
average of 0.5093. The inter-laboratory speed correlations 
range from 0.150 to 0.447 with an average of 0.2637. The 
lecture-laboratory speed coefficients range from —0.1524 to 
0.4914, the average being 0.2873. Thus it is seen that there is 
a fairly moderate positive correlation between the speed per- 
formance on the lecture-text tests. The laboratory speed is 
much more irregular, as indicated by the coefficients of correla- 
tion, though the tendency is positive. The lecture-laboratory 
speed correlations are very low and show no definite tendency, 
some being negative, others positive. 

Knowing these facts we are not surprised to find very little 
relationship between the speed and accuracy factors. Table 4 
gives the speed accuracy coefficients of correlation. They range 


TABLE 4 
Zero order correlations between speed and accuracy and P. E. 

Tie = 0.0049 ± .049      faa = -0.0272 ± .046     fa5 = -0.0544 ± .049 
Ti10 = 0.0163 ± .049     roa = 0.0220 ± .049      rano = -0.0340 ± .049 
iu = 0.0370 ± .049       Tei8 = 0.0522 ± .049     fs = -0.0284 ± .049 
133 = 0.0622 ± .049      Ta16 = 0.1122 ± .048     T3132 = 0.0023 ± .049 
n14 = 0.3830 ± .042      fan = 0.0068 ± .049 
Tair = 0.0088 ± .049 
Tea = 0.0217 ± .049 
fais = 0.0523 ± .049 

Tss = -0.1239 ± .048     nu = -0.2055 ± .047      Ts16 = 0.0531 ± .049 
rer = -0.0055 ± .049     rons = -0.0019 ± .049 
ron = 0.0704 ± .049      roeis = 0.1418 ± .048 
Te. = -0.0703 ± .049 
Teas = 0.0064 ± .049 
Teis = 0.0677 ± .049 
Tr = 0.0778 ± .049 
Tes = 0.1160 ± .048 
Te = 0.1417 ± .048 





from —0.2055 to 0.3830, the average being 0.0669. As can be 
found from studying the table the range is not really indicative 
of the very low value of these coefficients. Twenty-three of 
the thirty correlations between speed and accuracy have 
one or two zeroes in the first and second places. Five have 
a “‘1” in the first place and only two have “2” or greater as the 
first digit in the coefficient. There is very little, if any, rela- 
tionship between the times required and the scores made in 
answering tests of the type herein used. Such being the case 
it is not yet clear as to how final scores are to be made up of 
weighted speed scores combined with accuracy scores. 
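
The tally of coefficients by their leading digit can be reproduced directly. The following Python sketch is illustrative only: the list holds the thirty magnitudes as they can be read from table 4, with signs left out because the tally does not depend on them, and the names are assumptions of this sketch.

```python
# Magnitudes of the thirty speed-accuracy coefficients, as read from table 4.
TABLE_4_MAGNITUDES = [
    0.0049, 0.0272, 0.0544, 0.0163, 0.0220, 0.0340,
    0.0370, 0.0522, 0.0284, 0.0622, 0.1122, 0.0023,
    0.3830, 0.0068, 0.0088, 0.0217, 0.0523, 0.1239,
    0.2055, 0.0531, 0.0055, 0.0019, 0.0704, 0.1418,
    0.0703, 0.0064, 0.0677, 0.0778, 0.1160, 0.1417,
]


def tally_by_leading_digit(values):
    """Count coefficients whose first decimal digit is 0, 1, or 2 and above."""
    tally = {"0": 0, "1": 0, "2+": 0}
    for value in values:
        first_digit = int(abs(value) * 10)
        if first_digit == 0:
            tally["0"] += 1
        elif first_digit == 1:
            tally["1"] += 1
        else:
            tally["2+"] += 1
    return tally


tally = tally_by_leading_digit(TABLE_4_MAGNITUDES)
mean_magnitude = sum(TABLE_4_MAGNITUDES) / len(TABLE_4_MAGNITUDES)

print(tally)                     # {'0': 23, '1': 5, '2+': 2}
print(round(mean_magnitude, 4))  # 0.0669
```

The counts agree with the twenty-three, five and two given in the text. The mean of the magnitudes also comes to 0.0669, which suggests, though the article does not say so, that the reported average was taken without regard to sign.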

We may conclude then,— 

1. The tests used in this study are moderately valid measures 
of some kind of responses—we hope a knowledge of psychology 
and skill in general. 

2. The speed factor is at no time more than fairly constant. 

3. The lecture tests have a higher reliability of speed per- 
formance than does either of the other two combinations under 
consideration. 

4. There is little relationship between the speed on the lecture 
and the same on the laboratory tests and no definite negative 
or positive trend. 

The conclusions just arrived at do not mean that speed does 
not enter into taking tests such as ours. If anything they 
mean that speed does play a very definite part, some students 
being able to do as well as others in much less time; others 
using little time but making low scores through being inaccu- 
rate. According to Myers the whole process could possibly be 
speeded up but it is doubtful, in view of our findings, if the 
display of individual differences would be much affected. Some 
of the factors determining our low correlations are probably 
these individual differences. Ruch’s study “Power” vs. 
“Speed” in Army Alpha would tend to support this conclusion. 
We believe that both our methods and findings help to point 
the way to a more scientific way of combining the speed and 
accuracy factors into an improved objective measure of student 
achievement at least in General Psychology. 





NOTES AND NEWS 


The Child Study Association of America held its Fortieth Anniver- 
sary Conference and Dinner at the Hotel Pennsylvania, New York City, 
on November 20, 1928. The introductory address was made by Mrs. 
Howard S. Gans, President. Dr. Everett Dean Martin, Director of the 
People’s Institute, and Mr. Eduard C. Lindeman, Consulting Director, 
National Council of Parental Education, acted as chairmen. Addresses 
were given as follows: Professor Helen T. Woolley, Director, Institute 
of Child Welfare Research, Teachers College, “Recent Changes in the 
Status and Attitudes of Children;” Dr. Ernest R. Groves, Research 
Professor of Sociology, University of North Carolina, “Recent Changes 
in the Status and Attitudes of Youth;” Mrs. Sidonie M. Gruenberg, 
Director, Child Study Association of America, “Recent Changes in the 
Status and Attitudes of Parents;” Mr. Porter R. Lee, Director of New 
York School of Social Work, “Parents as Factors in the Formation of the 
Community’s Attitudes.” 

As a part of the afternoon session Dr. W. E. Blatz, Director of St. 
George’s School for Child Study, University of Toronto, and many 
others reported upon current programs in parental education. Some of 
the speakers following the dinner were Drs. Felix Adler, Edward L. 
Thorndike, William Heard Kilpatrick, Bernard Glueck, Anna Garlin 
Spencer, Dean William F. Russell and Professor Patty Smith Hill. 
The current issue of Child Study commemorates the Fortieth Anniver- 
sary of the Child Study Association of America. 


A new undergraduate program at Columbia College went into effect 
September, 1928. This program is due to the vigorous and progressive 
leadership of Dean Hawkes. Since President Eliot attacked the old, 
narrow and prescribed course of study with his arguments for an elec- 
tive system more than half a century ago the undergraduate program 
in this country has been in a state of flux. As yet no plan has been 
worked out which gives general satisfaction and which does for the pres- 
ent-day undergraduate what the old classical course succeeded in doing. 
After working for most of the past year the faculty of Columbia College 
has planned the work of the first two years so that it will be preliminary 
and exploratory. Freshmen and sophomores completing this work will 
have gained a good general Junior College education. The work of the 
next two years is planned so as to give it a genuine university character. 
This will prepare either for the professional or the graduate school 
courses or for the yet more serious business of living a useful and high- 
minded life. 

Effort has been made to measure the progress of the student in real 
achievement as over against courses, hours or points. Any student 
who shows himself able to omit any of the courses ordinarily prescribed 
for undergraduate students will be encouraged to do so. This will make 
a place for studies for which he is ready and in which he is interested. 

Two other interesting experiments will be undertaken, first, lecture 
courses for which no prerequisite or examination will be required; 
second, reading courses to be offered in coöperation by two or three 
instructors in different but somewhat allied departments. Members of 
other institutions will watch with interest these experiments in college 
and university education. 


The United States Public Health Service with the coöperation of the 
Department of Health and Education of the District of Columbia has 
recently completed a thorough examination of the vision of 1860 white 
school children in Washington between the ages of six and sixteen years. 
Only 3.4 per cent of the children were found with eyes free from defects. 
Far-sightedness (hyperopia) was found in 63 per cent of the defective 
cases. Near-sightedness (myopia) affected 5.5 per cent, and astigma- 
tism 28 per cent. Experts concluded that glasses were needed by 34 per 
cent of the entire group. The investigation appears to show that the 
simple visual acuity test will reveal but a small percentage of the actual 
number of refractive errors in children. Thus the simple letter tests 
made in schools while of much value fail to reveal particularly the latent 
cases. Eye strains become worse as the child advances and later the 
individual goes out into commercial and other work much handicapped. 

The above is only a sample of the valuable investigations now being 
carried on by the Eye Sight Conservation Council of America, The 
Times Building, New York City. 


Dr. Harold Ellis Jones, Director of Research at the Institute of Child 
Welfare, University of California, has been appointed Associate Profes- 
sor of Psychology in that institution. 


The following appointments to the staff of the Research Department 
of The Training School at Vineland are announced for the current year: 
Elizabeth J. Jewell, A.B., Vassar, assistant research clinician; Ruth T. 
Melcher, M.A., University of Kentucky, senior research fellow; Amor- 
ette E. Wolcott, M.A., Columbia, research fellow; Margaret B. Torrey, 
B.S. University of New Hampshire, research fellow. 


A trade practice conference for publishers of periodicals was held at 
the Waldorf-Astoria Hotel, New York City, on October 9, 1928. The 
Hon. William E. Humphrey, Attorney Martin A. Morrison and M. 
Markham Flannery, Director of Trade Practice Conference, presided. 
Practically the entire periodical publishing field was represented in 
person or through associations. The purpose of the conference was not 
in any degree to establish a censorship over the press. Any plan 
adopted was to be the expression only of the judgment, conscience and 
purpose of those who adopt it. 

The following is a part of the resolution as finally amended and 
unanimously adopted. It is an excerpt from a statement made by 
Chairman Humphrey: 

“The majority of the periodical publishers not only obey the law but 
often go far beyond what the law requires in selecting the advertisements 
they will publish. I do not believe there is an industry in America 
conducted by more honest, high-minded, public spirited men and women 
than the publication industry. I do not believe that any industry in 
America has greater power for good. I believe that the future greatness 
and security of the nation rests to a greater extent upon the publishing 
industry than probably any other.” 

The National Better Business Bureau was selected by the publishers 
as the machinery through which its industry would do its own policing of 
the periodical field with reference to standards for advertisements. 


At the Conference of Experimental Psychology held on March 30 and 
31, 1928, at Carlisle, Pa., under the auspices of the Division of 
Anthropology and Psychology of the National Research Council, the 
desirability of a 
conference on psychological journals was discussed. Growing out of 
this discussion a conference of editors and managers of psychological 
journals has been called for November 30 and December 1 in Washing- 
ton, D.C. This conference is also to be held under this same division 
of the National Research Council and is called by Professor Knight Dun- 
lap, Chairman. 


BOOK REVIEWS 


Mary Buell Sayles. The Problem Child at Home: A Study in Parent- 
Child Relationships. The Commonwealth Fund, Division of 
Publications, New York City, 1928. 342 pp. 

This book follows The Problem Child in School by the same author. 
The Problem Child at Home deals with the interpretation of data taken 
from clinical records of some two hundred cases during a five-year 
period. It also presents twelve illustrative cases of problem children 
and, briefly, the treatment of these cases. The author thinks that the 
case materials presented are representative. She arrives at this con- 
clusion after receiving suggestions from other clinical workers; also, 
through information brought to light upon attendance through a long 
series of clinical case conferences. She admits, however, that one 
cannot be certain that these cases are fair samples. 

The plan of this book is good. The author has wisely refrained from 
attempting to present the methods of treatment of the cases and other 
data which may have been gathered during the interviews and examina- 
tions. 

The book is divided into three parts, as follows: Part I is characterized 
as the Emotional Satisfactions which Parents and Children Seek in One 
Another; Part II treats of Mistaken Ideas which Influence Parent-Child 
Relationships; Part III presents in narrative form twelve case histories 
together with brief comments on the treatment of these cases. Parts I 
and II might be grouped together as both have to do with the interpre- 
tative phase of parent-child relationships. 

While the reviewer does not always agree in the use of terminology, 
particularly in the description of the ‘‘emotional needs’’ of childhood 
and the resulting maladjustments which follow when a failure to meet 
these needs occurs, the author has done well to emphasize some of the 
observable emotional tendencies which seem almost always tied up in 
some form or other with the problem child. In Chapter 1 are listed the 
“emotional needs” of childhood: (1) need for security (love and harmony 
between parents and between parents and child are essential in order to 
meet this need); (2) need for growth, that is, for freedom and 
opportunity to grow; (3) need of a concrete ideal embodied in the 
parents. A failure on 
the part of the parents to meet these needs results in emotional habits 
which are maladjusting in later life. One cannot assume a set of needs 
common for all parents as one is able to do in the case of children, for the 
satisfactions sought by parents in their first born are determined largely 
by how their emotional needs were met when they were children. Here, 
it seems, is a recognition of the fact that emotional tendencies (tenden- 
cies toward repression, etc.) begin early and not only last through the 
life of the individual but also color the early life situations of that 
individual’s children, and are often responsible for similar tendencies 
being built up in the offspring. Thus the cycle is completed only to 
begin again, and truly it may be said, “The sins of the fathers may be 
visited upon their children, even unto the third and fourth generations.” 

Further discussion of the parent-child relationship is continued. 
Topics like jealousy, favoritism, antagonism, and ambitions are inter- 
preted in the author’s own way. The tendency to be ambitious for 
their children may be compensatory for parents. A sort of “vicarious 
functioning” (reviewer’s term) occurs. Of course the emotional tend- 
ency may manifest itself in a number of ways when life situations have 
thwarted the parents’ plans but the father usually finds satisfaction in 
having the son do or “be” what he wanted most to do or be and couldn’t; 
and this without regarding the ability or the inclinations of the son. 
The father wanted to be a doctor but couldn’t. “I am going to make a 
doctor out of my son,” he says. While not always pointed out by the 
author the psychological implications here are many. 

The book is rich in illustrative and case material. All through it, 
however, one gets the impression that the real problem of the social 
worker, the psychiatrist, the psychologist, is to be found not only with 
the child but with the parents and the home life as well. While this view 
is perhaps held by most clinical workers this author furnishes concrete 
materials to support this generalization. The book is informative and 
readable. 

James R. Patrick, 
Ohio University. 


Arthur C. Jacobson. Genius, Some Revaluations. New York: Green- 
berg. 160 pp. $2.50. 

The author predicates alcohol, unstable ancestry, multiple person- 
ality, ethnic mixtures, physical debility and tuberculosis as the neces- 
sary antecedents of productive genius. Since the reviewer, in so far as he 
is aware, does not possess such prerequisites he warns the reader that 
the review will be “dull, mediocre, Rotarian and Coolidge-like” and if 
the reader continues it will be at his own risk. 

The author gloomily looks forward to the time when prohibition, 
eugenics, social programs, etc., will eradicate genius and reduce the 
world to the level of the “stale, flat and unprofitable.” The reviewer 
refers him to the Democratic candidate for the presidency for evidence 
that there is still enough alcohol left for a sonnet or two; to the Red 
Cross for proof that there is enough tuberculosis in existence to repro- 
duce all literature; to any place in the United States for evidence that 
ethnic mixtures are still being made; to the writing of the eugenicists 
for the belief that their programs are more read than practiced; to any 
psychiatrist for the encouragement that mental instability exists in 
abundance. The reviewer can find plenty of woe in which to rejoice and 
fails to see why the author is so gloomy over the prospect of a geniusless 
world—provided, of course, that genius does come by the route which 
our author believes. 

The discussion of genius is almost entirely confined to literary men. 
This raised the interesting question in the mind of the reviewer as to 
whether a genius in the field of science is also a product of the above 
recited conditions. The author confesses to a belief that literature is 
the highest and best form of the expression of genius and it may be 
possible that he ignores most scientists as being unworthy of the title of 
“genius.” At any rate the reviewer confesses to a great deal of curi- 
osity as to whether as many labels of degeneracy can be pinned upon 
eminent scientists as upon authors of note. Can solid facts be snatched 
from nature through the medium of a secondary personality in the same 
fashion that literary beauty and fancy are? Would a scientist released 
by alcohol from the limitations of inhibition spy upon and observe the 
inner nature of an atom and report his observations accurately? To 
bring the matter more to a point, is it not possible that artistic genius 
may be more dependent upon the release of fancy from its inhibitions 
than is genius in other fields? Art which does not tug at the emotions 
is not art; therefore the genius in art has need of many emotions. But 
science which is not verifiable is not science, and therefore the scientist 
would better refrain from stimulants while making his observations 
lest they be multiple when they ought to be single. 

Undoubtedly there have been many cases of release from inhibition 
through multiple personalities, alcohol, etc., in literary fields. The 
author cites many examples, but somehow leaves the impression that 
he is determined to find one of his “causes” for genius whether it exists 
or not. For example the author admits that the biographers of Schiller 
have been in doubt about the nature of the malady which afflicted him, 
but for the author there is no question but that Schiller had phthisis. 
The reviewer makes no pretense to a medical understanding of the 
case, but having observed the medical profession disagreeing frequently 
upon the diagnosis of the trouble of a living person, he feels disinclined 
to trust a post-mortem examination made at this late date and based 
upon written and disagreeing reports of the case. The same thing may 
be said of still other cases cited by the author. Every historian is aware 
that legends grow rapidly about the personality of any great man. The 
reviewer wonders at times whether the author has sufficiently dis- 
counted the stories of the eccentricities and habits of the great and 
near great of whom he writes. Furthermore is it not possible to find 
some flaw in the life or ancestry of every person? Are these traits 
peculiar to genius? Obviously not! Indeed our author admits that 
these factors are but the soil in which genius develops. There still 
remains the problem then of explaining why of two drinkers one becomes 
a poet and the other a sot. The reader of the book will be interested, 
but he will eventually lay down the book with the feeling that it is too 
uncritical to be a real “revaluation of genius.” 
Stuart M. Stoke, 
Ohio University. 


Gates, Georgina Stickland. The Modern Cat: Her Mind and Manners. 
New York: The Macmillan Company, 1928. pp. ix + 196. 

In December, 1922, at the annual convention of the A. A. A. S. Prof. 
James Harvey Robinson made a plea for the humanization of knowledge. 
Professor Robinson argued that it was not enough to find out facts and 
write them down in scientific treatises for other scientists; he insisted 
that scientific discovery and thought must be made the property of all 
people. The bishop of Ripon may have had a similar thought in mind 
when he recently asserted that science might well take a ten-year 
holiday. This somewhat foolish statement was probably based upon 
the conviction that vast amounts of undigested knowledge are already 
available and that the next needed step is that of bringing scientific 
knowledge to the masses. 

Here is a duty which the scientist (or somebody) owes to the common 
people but a duty which is not easily fulfilled. For it may be said, 
without casting reflections upon any one, that writing is an art which 
many scientists do not possess. The problem of course is to humanize 
or to democratize knowledge, not to cheapen or to adulterate knowledge. 
And herein lies the difficulty. 

Slosson’s Creative Chemistry, Thomson’s Outline of Science, Browne's 
This Believing World, Durant’s The Story of Philosophy, and Dorsey's 
Why We Behave Like Human Beings, are all interesting and refreshing 
examples of successful attempts at popularization. Unfortunately, it 
would be easy to assemble a much longer list of authors whose attempts 
at popularization have been abortive. 

In a recent volume Dr. G. S. Gates has attempted “to collect from the 
literature of psychology those facts which are pertinent to a discussion 
of cat behavior and the cat mind” (p. viii). The problem of method- 
ology is presented in so simple and yet so charming a manner that it 
should arouse the interest of the most indifferent student. The results 
of experimental studies are then reviewed and this account is followed 
by an attempt to interpret cat behavior. The book is intended for two 
classes of readers; “those who love cats and would like to know more 
about the explanation of their actions, and those who may wish to 
obtain, through a description of methods employed with one animal, a 
first glimpse of the ways of comparative psychology” (p. ix). 

Some difficulties are of course to be expected in a presentation of this 
general type. On the whole Dr. Gates has met these problems in better 
fashion than one might suppose. Her chief aim has been that of carry- 
ing to the non-technically educated layman the wealth of scientific 
information available regarding the cat. And it must be admitted that 
she has told the story in a seductive manner. More books of this sort 
would surely do no harm to psychology. 

Harvey C. Lehman, 
Ohio University. 


Frederick Wood Jones and Stanley D. Porteus. The Matrix of the 
Mind. University Press Association, University of Hawaii, Hono- 
lulu, 1928. Pp. 457. 

The purpose of this book as stated in the preface is the blending of 
neurology and psychology. That the authors have accomplished their 
purpose thoroughly no one can doubt. There may be some difference 
of opinion, however, as to the scientific accuracy of certain parts of the 
work. The first part of the book, written by Wood Jones and dealing 
especially with the nervous system, traces its evolution and embryonic 
development in some detail. Some may object to the anthropomor- 
phisms with which this part of the book abounds, the author seeming to 
endow nervous tissue, primitive forms of animal life, and natural proc- 
esses with intention and purpose. For example he says: “There is a 
choice offered to the animal type; what area of its surface shall it elect 
to set aside for this purpose of providing cells which are to make an 
ordered invasion of its body. In all animal types the election has not 
been the same; some have chosen one area, some another.” And else- 
where: “A new type of animal was tried, and, as the skin of the front or 
ventral surface had proved a failure, the skin of the back or dorsal sur- 
face was tried.” The author uses the term “cytoclesis” as the name 
for “the call of cell to cell,” and says that Kappers’ neurobiotaxis is in 
reality nothing but a special case of cytoclesis. He uses the terms 
“dendron” and “neuraxon” in place of the more common dendrite and 
axone. 

The second part, by Porteus, is more directly psychological. Object- 
ing to the term instinct as being a blanket for our ignorance he discusses 
the phenomena usually subsumed under this head in a chapter entitled 
“Built-in Habits.” These are “the learned reactions of the race which 
are the unlearned reactions of the individual.” The theory is advanced 
here that many built-in habits, e.g., those connected with the preserva- 
tion of the race, originated as self-preservative behavior. “Instead of 
regarding the habit of the cow in seeking concealment prior to parturi- 
tion as indicating anticipatory provision for the calf, it should be 
thought of as originally an impulse which favors her own protection 
during temporary helplessness against her natural enemies or those of 
her own kind. By racial conditioning the sensations preceding parturi- 
tion have become the cue for the habit to appear.” 

In his discussion of intelligence the author stresses that the highest 
function of the cortex is to enable the animal to make new adaptations to 
situations. In this he declares himself to part company with the 
majority of psychologists whose accepted definition of intelligence is 
“general mental adaptability to a new situation.” He regards it as 
unwise to lay so much emphasis on the newness of the situation instead 
of on the adaptive response. 

The book represents a much needed type of endeavor. Neurologists 
and psychologists alike realize that they have many problems in common 
and many which neither can profitably attempt alone. Probably the 
present work is the best we can expect until an experimental attack has 
more thoroughly opened up the field where the two subjects overlap. 
The authors say that it is intended to provide a common background 
upon which further studies of human material are to be projected. As 
such it should serve admirably. 

Amos C. ANDERSON, 
Ohio University. 
















NEW BOOKS AND PAMPHLETS RECEIVED* 


Books and pamphlets for review should be sent to James P. Porter, 
Department of Psychology, Ohio University, Athens, Ohio. 


The Cave Man’s Legacy. E. Hanbury Hankin. 180 pp. 

Child Labor in Mississippi. Charles E. Gibbons. National Child 
Labor Committee, 215 Fourth Avenue, New York City. 25 cents. 
34 pp. 

Conference on Experimental Psychology. National Research Council, 
B and 21st Streets, Washington, D.C. 114 pp. 

The Development of Children’s Number Ideas in the Primary Grades. 
William A. Brownell. University of Chicago, Chicago, Ill. 
241 pp. 

Dictionary of Psychiatric Clinics for Children in the United States. The 
Commonwealth Fund, Division of Publications, New York City. 
181 pp. 

Educational Activities of New England Quakers. Zora Klain. West- 
brook Publishing Company, Philadelphia, Pa. 228 pp. 

Foibles of Insects and Men. William Morton Wheeler. Alfred A. 
Knopf, New York City. 217 pp. 

Fundamentals of Objective Psychology. John Frederick Dashiell. 
Houghton Mifflin Company, New York City. $3.00. 588 pp. 

Genius. Arthur C. Jacobson. Greenberg, Publishers, New York 
City. $2.50. 160 pp. 

The Hand and the Mind. M. N. Laffan. E. P. Dutton & Company, 
681 Fifth Avenue, New York City. 96 pp. 

An Investigation of Practices in First Grade Admission and Promotion. 
Mary M. Reed. Bureau of Publications, Teachers College, Colum- 
bia University, New York City. 135 pp. 

Modern Psychology: Normal and Abnormal. Daniel Bell Leary. 
J. B. Lippincott Company, Philadelphia, Pa. 441 pp. 

New Introduction to Science. Bertha M. Clark. American Book 
Company. 480 pp. 

The New Morality. Durant Drake. The Macmillan Company. $2.50. 
359 pp. 





* Mention here does not preclude further comment. 


Present Day Law Schools in The United States and Canada. Alfred Z. 
Reed. Carnegie Foundation for the Advancement of Teaching, 
Bulletin No. 21, 1928, 522 Fifth Avenue, New York City. 598 pp. 

Prinzipien und Methoden der Kunstpsychologie. Paul Plaut. Urban 
& Schwarzenberg, Friedrichstrasse 105 b, Berlin, Germany. 220 pp. 

The Problem Child at Home. Mary Buell Sayles. Commonwealth 
Fund, Division of Publications, New York City, $1.50. 342 pp. 

Psychologische Begutachtung der Erwerbsbeschränkten. Walther Pop- 
pelreuter-Bonn. Urban & Schwarzenberg, Berlin, Germany. 
183 pp. 

The Psychology of Language. Walter B. Pillsbury and Clarence L. 
Meader. D. Appleton Company. $3.00. 306 pp. 

Le Rêve. Dr. Martin Gomes. Rodregues & Company, Rio de Janeiro. 
179 pp. 

The Study of Religion in State Universities. Herbert Leon Searles. 
University of Iowa Studies, Vol. 1, No. 3. Iowa City, Iowa. 91 pp. 

Untruthfulness in Children: Its Conditioning Factors and Its Setting in 
Child Nature. W. E. Slaght. University of Iowa Studies, Vol. I, 
No. 4. University of Iowa, Iowa City, Iowa. 79 pp. 



















INDEX OF SUBJECTS 


Titles of books are followed by the name of the author of the 
work reviewed. All other references are to original articles. 


Ability grouping in the Junior 
High School (Ryan and Cre- 
celius), 250. 

About Ourselves (Overstreet), 532. 

Achievement, full-blood Indians, 
intelligence and, 511; —— of 
normal school students, intelli- 
gence and, 335. 

Accuracy, factors, objective tests, 
general psychology, speed and, 
636. 

Adolescence, organization, mental 
and physical traits during, 228. 

Advertising, technique, psycho- 
logical study, poster board, 43. 

Aesthetics, affective reactions, 
painting reproductions, 125. 

Apprehension affected by inter- 
letter hair-spacing and charac- 
teristics, individual letters, 
range of, 82. 

Attention and its testing, distribu- 
tion of, 495; —— effects of 
fatigue on distribution of, 595. 


Back-bone titles, legibility of, 217. 

Behavior, score card, personal, 
140. 

Binet test, graphic summary, 
Stanford-, 343. 

Capacity, industries, school, test 
for motor, 169; ——, intelligence 
in school children, relation be- 
tween cranial, 524. 


Classification of pupils in private 
Schools (Rogers), 156. 

Children, college preparatory 
courses, progress, elementary 
school, relation of class room 
success, 429; ——, overstate- 
ment in third-grade, 404; ——, 
relation between cranial capac- 
ity and intelligence in school, 
524. 

College freshmen, factors affect- 
ing success, 517; —— prepara- 
tory courses, progress, elemen- 
tary school, relation class room 
success of children, 429. 

Cranial capacity and intelligence, 
school children, 524. 


Differences, high school seniors, 
psychological tests, sex, 56. 

Dreams (Stiles), 534. 

Duo-art, laboratory instrument, 
214. 


Educational Psychology (Jordan), 
454. 

Efficiency in reading, numerals 
versus words, 190. 

Elementary psychology, an (Phil- 
lips), 155. 


Familiarity, retail stores, associa- 
tion test of, 437. 

Fatigue on distribution of atten- 
tion, effects, 595. 




Feelings, effects, continuous work 
upon output and, 459. 

Form, speed of reading, influence 
of type, 359. 

Freshmen, factors affecting suc- 
cess of college, 517. 

Full-blood Indians, intelligence 
and achievement of, 511. 

Fundamentals of human motiva- 
tion (Troland), 538. 


Grades, personality revealed by 
mental test scores and, 261. 


Hair-spacing and by character- 
istics of individual letters, range 
of apprehension as affected by 
inter-letter, 82. 

High school seniors, psychological 
tests, sex differences in, 56. 

How to do research in education 
(Good), 535. 


Indians, intelligence and achieve- 
ment, full-blood, 511. 

Industries and in school, test for 
motor capacity, 169. 

Instrument, duo-art, laboratory, 
214. 

Intelligence and achievement, 
full-blood Indians, 511; ——, 
measurement of social, 317; ——, 
school children, relation 
between cranial capacity and, 
524; ——, study of play, relation 
to, 369; ——, validity of test of 
social, 426. 


Intelligence tests compared with 
seven others, Kuhlmann-Anderson, 
545; —— scores, power and 
speed, their influence upon, 631; 
—— scores and success, rational 
organization problems, correlation 
between, 621; ——, their 
significance for school and society, 
456. 

Intensities, dyad preferences, 148. 

Inter-letter hair-spacing and by 
characteristics, individual let- 
ters, range of apprehension as 
affected by, 82. 

Interpretation of educational 
measurements (Kelley), 160. 

Investigations in the hygiene of 
reading (Blackhurst), 164. 


Laboratory instrument, duo-art, 
214. 

Letters, range of apprehension as 
affected by inter-letter hair- 
spacing and characteristics of 
individual, 82. 

Lure of superiority, the (Vaughn), 
451. 


Mental life, the—a survey of 
modern experimental psychol- 
ogy (Ruckmick), 353. 

Mental and physical traits during 
adolescence, organization, 228. 

Mental test scores and by school 
grades, personality revealed by, 
261. 

Method, analyzing musical style, 
reproducing piano, 200. 

Methoden der Wirtschaftspsychologie, 
539. 

Morale, training, public presenta- 
tion, school exercise: school, 417. 

Morals in Review (Rogers), 162. 

Motor capacity, industries, school, 
test for, 169. 

Musical style, reproducing piano, 
200; —— talent, reliability and 
validity of Seashore tests, 
468. 

My life transformed (Heckman), 
350. 




Normal school students, intelli- 
gence and achievement, 335. 


Objective tests, elementary psy- 
chology, comparison, five types, 
398; —— tests, general psychol- 
ogy, speed and accuracy as 
factors in, 636. 

Output and feelings, effects, con- 
tinuous work, 459. 


Performance scale, restandardi- 
zation of point, 278. 

Personal behavior, score card, 
140; —— problems, college stu- 
dent, 1. 

Philosophy (Russell), 252. 

Physical traits during adolescence, 
organization of mental and, 228. 

Piano, analyzing musical style, 
reproducing, 200. 

Play, relation to intelligence, 369. 

Point performance scale, restand- 
ardization, 278. 

Practical psychology (Robinson), 
255. 

Preferences, different intensities, 
dyad, 148. 

Pressey X-O tests, reliability of, 
477. 

Principles of abnormal psychology 
(Conklin), 351. 

Problems, college student person- 
nel, analytical study, 1; ——, 
correlation between intelligence 
test scores and success, rational 
organization, 621. 

Professional and business ethics 
(Taeusch), 251. 

Progress, elementary school, rela- 
tion of class room success, 
children, college preparatory 
courses, 429. 

Propaganda technique in the 
world war (Lasswell), 249. 




Psychological care of infant and 
child (Watson), 354; —— study 
of poster board advertising, 
technique, 43; —— tests, sex 
differences in 5925 high school 
seniors, 56. 

Psychologie expérimentale (Piéron), 
256. 

Psychology by experiment (Kline 
and Kline), 154; —— of individual 
differences, the (R. S. 
Ellis), 540; —— of personality, 
the (Bagby), 453; ——, the 
science of mental activity 
(Lund), 155; —— of aesthetics, 
affective reactions, painting 
reproductions, 125; ——, com- 
parison, five types, objective 
tests, elementary, 398; ——, 
speed and accuracy, factors, ob- 
jective tests in general, 636; 
——, why students register for, 
242. 

Public presentation, school exer- 
cise, training in; school morale, 
417; —— speaking, modes of 
emphasis, 611. 


Rational organization problems, 
correlation between intelligence 
test scores and success in, 621. 

Reactions, painted reproductions, 
125. 

Reading, influence, type form on 
speed of, 359; ——, numerals 
versus words, efficiency in, 190. 

Reliability, Pressey X-O test, 
477. 

Reproducing piano, method of 
analyzing musical style, 200. 

Research adventures in university 
teaching (Pressey, Worcester, 
Chambers, et al.), 158. 

Retail stores, association test of 
relative familiarity, 437. 







Scale, restandardization, point 
performance, 278. 

Scales for rating pupils, answers to 
nine types of thought questions 
in general science, American 
history, civics and English liter- 
ature (Odell), 157. 

School children, relation between 
cranial capacity and intelli- 
gence, 524; —— exercise: school 
morale, training in public 
presentation, 417; —— grades, 
personality as revealed by men- 
tal test scores and by, 261; —— 
morale, training in public pres- 
entation of school exercise, 417; 
——, relation of class room suc- 
cess, children, college prepara- 
tory courses, rate of progress, 
elementary, 429; —— students, 
intelligence and achievement of 
normal, 335; ——, test for motor 
capacity in industries and, 169. 

Scores and by school grades, per- 
sonality, revealed by mental 
test, 261; —— and success, 
rational organization problems, 
correlation between intelligence 
test, 621; ——, power and 
speed, their influence upon 
intelligence test, 631. 

Seashore tests, musical talent, 
reliability and validity of, 468. 

Sex differences in school children 
(Lincoln), 251. 

Social intelligence, measurement, 
317; —— intelligence, validity of 
test, 426. 

Speaking, modes of emphasis in 
public, 611. 

Speed of reading, influence of type 
form, 359; ——, their influence 
upon intelligence test scores, 
power and, 631. 


Stanford-Binet test, graphic sum- 
mary, 343. 

Statistical methods for students in 
education (Holzinger), 450. 

Sterilization in California, eugenic, 
304. 

Stores, association test of relative 
familiarity of retail, 437. 

Student personal problem, col- 
lege, 1. 

Students discriminate traits asso- 
ciated with success in teaching, 
602. 

Studies in vocational information 
(Bate and Wilson), 536. 

Success of children in college 
preparatory courses to progress, 
elementary school, relation of, 
429; ——, college freshmen, 
factors affecting, 517; —— in 
teaching, can students discrim- 
inate traits associated with, 602. 

Supplementary reading assign- 
ment, the (Good), 163. 


Talent, reliability and validity, 
Seashore tests, musical, 468. 

Teaching, can students discrim- 
inate traits associated with 
success in, 602. 

Test, graphic summary of Stan- 
ford-Binet, 343; ——, relative 
familiarity, retail stores, asso- 
ciation, 437; ——, reliability of 
Pressey X-O, 477; —— scores 
and by school grades, person- 
ality revealed by mental, 261; 
—— scores and success, rational 
organization problems, correla- 
tion between intelligence, 621; 
—— scores, power and speed, 
their influence upon intelli- 
gence, 631; ——, social intelli- 
gence, validity of, 426. 




Testing, distribution of attention 
and its, 495. 

Tests compared with seven others, 
Kuhlmann-Anderson, 545; ——, 
elementary psychology, comparison, 
five types, objective, 
398; ——, general psychology, 
speed and accuracy, factors in 
objective, 636; —— of musical 
talent, reliability and validity 
of Seashore, 468; ——, sex differences, 
high school seniors, 
psychological, 56. 

That mind of yours (Leary), 544. 

Third-grade children, overstate- 
ment in, 404. 

Titles, legibility of back-bone, 
217. 

Traits associated with success in 
teaching, can students discrimi- 
nate, 602. 




Type form, speed of reading, in- 
fluence, 359. 

Types, objective tests, elementary 
psychology, comparison, 398. 


Understanding human nature 
(Adler), 352. 


Validity, Seashore tests of musical 
talent, reliability and, 468; ——, 
test of social intelligence, 426. 


Words for efficiency in reading, 
numerals versus, 190. 

Work, output and feelings, effects 
of continuous, 459. 


Your growing child (Bruce), 355. 













INDEX OF AUTHORS 


The names of authors of original contributions are printed in 
CAPITALS AND SMALL CAPITALS. 


ABELL, WENDELL, 511. 
Adams, Henry F., 261. 
Aikins, H. Austin, 535. 
Anderson, Amos C., 454, 651. 
Arthur, Grace, 278. 
Asher, E. J., 437. 
Bear, Robert M., 517. 
Beck, H. C., 217. 
Beik, A. K., 429. 
Bemmels, Violet, 404. 
Bird, Charles, 250. 
Book, William F., 56, 538. 
Brooks, Fowler D., 228. 
Broom, M. Eustace, 426. 
BROTEMARKLE, R. A., 1. 
Brown, A. W., 468. 
Burtt, H. E., 43, 217, 258, 540. 
CAMPBELL, E., 217. 
Charles, John W., 398. 
Conklin, Edmund S., 353. 
Crockett, T. S., 43. 
Crosland, H. R., 82. 
DeBow, L. A., 261. 
English, Horace B., 242. 
Estabrooks, G. H., 524. 
Farnsworth, P. R., 148, 214. 
Fenton, Jessie C., 356. 
Fenton, Norman, 352, 417. 
Freeman, Frank S., 631. 
Furniss, Louis, 261. 
Gamertsfelder, Walter S., 163, 
254. 
Garrison, K. C., 621. 
Garth, Thomas R., 511. 
Greene, Edward B., 343. 


Hansen, Einar A., 165. 
Hunt, Thelma, 317. 
Israeli, Nathan, 125. 
Johnson, Georgia, 82. 
Kuhlmann, F., 545. 
Lehman, Harvey C., 160, 369, 
453, 538, 650. 
Longstaff, H. P., 636. 
Meadows, John L., 56. 
Morton, R. L., 451. 
Mosher, Raymond M., 335. 
Paterson, Donald G., 359. 
Patrick, James R., 647. 
Pintner, R., 351. 
Poffenberger, A. T., 459. 
Popenoe, Paul, 304. 
Porter, James P., 636. 
Remmers, H. H., 477, 602. 
Ruch, G. M., 398. 
Schultz, Richard S., 169. 
Sias, A. B., 162. 
Smith, Hare W., 511. 
Stalnaker, J. M., 602. 
Stoke, Stuart M., 158, 251, 455, 
456, 541, 649. 
Thompson, Lorin A., 477. 
Tinker, Miles A., 190, 359. 
Vaughn, James, 354. 
Verwoerd, H. F., 495, 595. 
Voegelin, C. F., 148. 
Whipple, Guy A., 200. 
Wilson, William R., 156. 
Witty, Paul A., 369. 
Woodrow, Herbert, 404. 
Yepsen, Lloyd N., 140. 











